DOCUMENT RESUME

ED 265 224                                            TM 860 091

AUTHOR          Herman, Joan
TITLE           Report on the Revision of the CSE Evaluation Kit.
                Research into Practice Project.
INSTITUTION     California Univ., Los Angeles. Center for the Study
                of Evaluation.
SPONS AGENCY    National Inst. of Education (ED), Washington, DC.
PUB DATE        Dec 85
GRANT           NIE-G-83-0001
NOTE            255p.; For related documents, see ED 058 673 and ED
                175 887-94.
PUB TYPE        Reports - Descriptive (141) -- Guides - Non-Classroom
                Use (055)
EDRS PRICE      MF01/PC11 Plus Postage.
DESCRIPTORS     Achievement Tests; Attitude Measures; *Educational
                Assessment; *Evaluation Methods; Evaluators;
                *Formative Evaluation; *Program Evaluation; Program
                Implementation; Statistical Analysis; *Summative
                Evaluation
IDENTIFIERS     CSE Evaluation Kit; National Institute of Education;
                *Research into Practice Project

ABSTRACT



This document describes the revision of the Center
for the Study of Evaluation (CSE) Program Evaluation Kit, including
planning, development, and/or revisions of specific components. The
kit is a set of books, originally developed in 1978, and designed to
provide step-by-step procedural guides to help people conduct
evaluations of educational programs. To assure that the revision
would accurately portray evaluation theory and state of the art
practice, an advisory committee of five individuals was established.
The committee met as an advisory board and agreed on the following
changes to eight chapters: (1) Evaluation Handbook, provide
orientation as to how the field has changed and emphasize on-going,
internal improvement-oriented evaluation; (2) How to Deal with Goals
and Objectives, change title and revise substantially; (3) How to
Design a Program Evaluation, rewrite introduction; (4) How to Measure
Program Implementation, add brief overview and expand discussion; (5)
How to Measure Attitudes, essentially unchanged; (6) How to Measure
Achievement, change title and expand discussion; (7) How to Calculate
Statistics, essentially unchanged; and (8) How to Present an
Evaluation Report, change title and expand. How to Conduct a
Qualitative Study, a new addition to the kit, was recommended.
Authors provided drafts, several of which are appended to this
document. (LMO)



***********************************************************************
*    Reproductions supplied by EDRS are the best that can be made     *
*                     from the original document.                     *
***********************************************************************






Center for the Study of Evaluation 



UCLA Graduate School of Education 

Los Angeles, California 90024



DELIVERABLE - DECEMBER 1985 
RESEARCH INTO PRACTICE PROJECT

Report on the Revision of the CSE
Evaluation Kit



U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)

This document has been reproduced as received from the person or
organization originating it.
Minor changes have been made to improve reproduction quality.

Points of view or opinions stated in this document do not
necessarily represent official NIE position or policy.

"PERMISSION TO REPRODUCE THIS
MATERIAL HAS BEEN GRANTED BY



TO THE EDUCATIONAL RESOURCES
INFORMATION CENTER (ERIC)."



DELIVERABLE - DECEMBER 1985 
RESEARCH INTO PRACTICE PROJECT



Report on the Revision of the CSE 
Evaluation Kit 



Project Director: Joan Herman 



Grant Number: NIE-G-83-0001



CENTER FOR THE STUDY OF EVALUATION 
Graduate School of Education 
University of California, Los Angeles 



The project presented or reported herein was in part
performed pursuant to a grant from the National Institute 
of Education, Department of Education. However, the 
opinions expressed herein do not necessarily reflect the 
position or policy of the National Institute of Education, 
and no official endorsement by the National Institute of 
Education should be inferred. 



Report on the Revision of the CSE Evaluation Kit 



CSE's Program Evaluation Kit is a set of books providing step-by-step 
procedural guides to help people to conduct evaluations of educational 
programs. Originally developed under a grant with the National Institute 
of Education and copyrighted in 1978, the Program Evaluation Kit is 
published by Sage Publications and includes the following eight books:

1. The Evaluator's Handbook serves as an organizing framework for the
entire kit, taking the potential evaluator step-by-step through a generic
procedure for conducting formative and summative evaluations. It also
provides a directory to the rest of the kit. The introduction in chapter
one calls attention to critical issues in program evaluation. Chapter 2,
How to Play the Role of Formative Evaluator, describes the diversified job
responsibilities of this role. Chapters 3, 4, and 5 contain step-by-step
guides for organizing and accomplishing three types of evaluations:

° A formative evaluation calling for a close working relationship with
  the staff during program installation and development (Chapter 3)
° A standard summative evaluation based on measurement of achievement,
  attitudes, and/or program implementation (Chapter 4)
° A small experiment, a procedure most likely to be of interest to a
  researcher or to the evaluator who wishes to either conduct pilot
  tests or evaluate a program aimed toward a few measurable objectives
  (Chapter 5)

The Handbook concludes with a Master Index to topics discussed throughout
the Kit.

2. How To Deal With Goals and Objectives provides advice about using goals
and objectives as methods for gathering opinions about what a program
should accomplish. The book then describes how to organize the evaluation
around them. It suggests ways to find or write goals and objectives,
reconcile objectives with standardized tests, and assign priorities to
objectives.

3. How To Design a Program Evaluation discusses the logic underlying the 
use of research designs - including the ubiquitous pretest-posttest design 
- and supplies step-by-step procedures for setting up experimental, 
quasiexperimental, and time series designs to underpin the collection of 
evaluation data. Six designs, including some unorthodox ones, are 
discussed in detail. The book outlines the use of each design, from
initial choice of program participants to analysis and presentation of 
results. Finally, it includes instructions about how to construct random 
samples. 

4. How To Measure Program Implementation presents step-by-step methods for 
designing and using measurement instruments - examination of program 
records, observations, and self-reports - to accurately describe how a 
program looks in operation. The first chapter discusses why measuring 
implementation is important and suggests several points of view from which 
you might describe implementation, for instance, scrutinizing the 
consistency of the program with what was planned or writing a naturalistic
description free of such preconditions. Its second chapter is an outline 
of the implementation section of an evaluation report. 

5. How To Measure Attitudes should help the evaluator select or design 
credible instruments for attitude measurement. The book discusses problems
involved in measuring attitudes - including people's sensitivity about this
kind of measurement and the difficulty of establishing the reliability and
validity of individual measures. It lists myriad sources of available
attitude instruments and gives step-by-step instructions for developing
questionnaires, interviews, attitude rating scales, sociometric
instruments, and observation schedules. Finally, it suggests how to
analyze and report results from attitude measures.

6. How To Measure Achievement focuses primarily on the tests administered 
for program evaluation. The book can be used in several ways. In case you 
plan to purchase a test, it helps you find a published test to fit your 
evaluation. To this effect, the book lists anthologies and evaluations of 
existing norm- and criterion-referenced tests and supplies a Table for 
Program-Test Comparison. The step-by-step procedure for completing this 
table directs you to compute numerical indices of the match between a 
particular test and the objectives of a program. If you want to construct 
your own achievement test, the book presents an annotated guide to the vast 
literature on test construction. Chapter 4 lists, as well, test item banks 
and test development and scoring services. The final chapter describes how 
to analyze and present achievement data to answer commonly-asked evaluation 
questions. 

7. How To Calculate Statistics is divided into three sections, each
dealing with an important function that statistics serves in evaluation:
summarizing scores through measures of central tendency and variability,
testing for the significance of differences found among performances of
groups, and correlation. Detailed worksheets, ordinary language 
explanations, and practical examples accompany each step-by-step 
statistical procedure. 

8. How To Present an Evaluation Report is designed to help you convey to
various audiences the information that has been collected during the course
of the evaluation. It contains an outline of a standard evaluation report;
directions and hints for formal and informal, written and oral, reporting;
and model tables and graphs, collected from the Kit's design and
measurement books, for displaying and explaining data.

Over the years, the kit has been widely used in the field and has been 
an important resource for training evaluators and for helping those charged 
with evaluation responsibilities to complete their tasks. In fact, over 
150,000 units of the kit have been sold since it was first published.
Although the kit continues to be distributed and to provide service, it has
been over ten years since it was first developed, and during that time the 
field of evaluation has matured and changed considerably. 

So too, the CSE evaluation kit needed to be changed to reflect these 
many changes and to continue to provide an updated, easy to follow resource 
for evaluation practitioners, a conclusion reached jointly by CSE, its
advisors, and Sage Publications. Consequently, CSE requested and received
permission from the NIE to use resources from the Research Into Practice
Project in partial support of the revision effort. This document describes 
the efforts supported with NIE funds, including planning for the revision 
and the development and/or revision of specific components. 

Planning for the revision

An important component of the planning effort was the establishment of
an advisory committee to assure that the revision would accurately portray 
evaluation theory and state of the art practice as it has evolved over the 
last fifteen years. Five individuals were approached and agreed to serve 
in this advisory role. Each of these individuals has played a prominent 
role both in conceptualizing the guiding theories and methodologies for the 
field and in the conduct of evaluation practice. They include: 



° Robert Boruch, Northwestern University
° Ernie House, University of Illinois
° Gene Glass, University of Colorado
° Michael Patton, University of Minnesota
° Carol Weiss, University of Arizona

An advisory board meeting was convened to discuss potential revisions 
to the kit. Consensus was reached on the need for the following
changes: 

1. Evaluation Handbook: Provide orientation to how the field of
evaluation has changed over the last 10 years, moving from an emphasis on 
mandated, external evaluations of federal and state-supported programs to 
concern with on-going, internal improvement-oriented evaluations of local 
programs; from concentration on experimental, quantitative methods to a 
consideration of more qualitative approaches; from near exclusive build-in 
sensitivity to stakeholders and potential utilization throughout the 
evaluation process, and to the influence of political and ethical issues. 

2. How to Deal with Goals and Objectives: Change title to "How to Define 
Evaluation Goals" or "How to Define Your Role As Evaluator." A shortened, 
substantially revised version of the current text would constitute only one
part of the new book; other concerns to be dealt with would include
considering different potential purposes for the evaluation; framing
evaluation questions based on client needs and theoretical predispositions;
identifying and gathering input from a variety of stakeholders; setting 
priorities, etc. 

3. How to Design a Program Evaluation: Rewrite introduction to address the
emergence and credibility of more qualitative approaches; refer reader to
qualitative methods book for design considerations for the qualitative
context.

4. How to Measure Program Implementation: Add brief overview of the 
naturalistic research paradigm early in the book, spelling out the
definitions of and differences between such things as naturalistic
research, qualitative research, ethnographic research, responsive
evaluation, etc.; refocus and enlarge current characterization of
naturalistic/responsive observation, giving attention to important design
issues (e.g., a priori vs. a posteriori design); expand discussion of
observation and of focused interviewing; include discussion of data 
reduction and analysis. 

5. How to Measure Attitudes: Essentially OK as is; examples outside of 
education may be helpful; need external review.

6. How to Measure Achievement: Change title to "How to Measure
Performance." Current discussion of achievement measures would be expanded
to include performance measures and indicators in education and other
fields. Text would consider issues such as deciding what to measure;
sources and type of measures; selection criteria; procedures for 
constructing measures. 

7. How to Calculate Statistics: Essentially OK as is; may want to
consider adding more complex analysis strategies. 

8. How to Present an Evaluation Report: Change title to "How to Communicate
Evaluation Findings" or "How to Report to Decisionmakers." In addition to
dealing with how to write a report at the end of the project that is
sensitive to user needs, the book would be expanded to consider reporting
and communication needs throughout the evaluation process. Included would
be identification of target audiences and their information needs, analysis
of evaluation and program context, and formulation of strategies and timing
for communicating evaluation results.

9. How to Conduct a Qualitative Study: A new addition to the kit series,
to include general orientation to qualitative strategies, guidelines on
when to use qualitative methods, and step-by-step procedures for design,
observation, interviewing, and analyses and syntheses of data.

Development of Revised Manuscripts

Potential authors to accomplish the changes identified above were 
identified and contacted to determine their availability and interest. As 
a result of these interactions, the following individuals agreed to take 
responsibilities as follows: 

1. Evaluation Handbook - Joan Herman & Marv Alkin (Editors for entire
revision) 

2. How to Define Your Role as an Evaluator - Brian Stecher, ETS

3. How to Design a Program Evaluation - Joan Herman 

4. How to Measure Program Implementation - Jean King, Tulane 
University 

5. How to Measure Performance - Joan Herman 

6. How to Communicate Evaluation Findings - Phyllis Jacobson, 
Fillmore Unified School District 

7. How to Conduct A Qualitative Study - Michael Patton, University of 
Minnesota 

Of these revisions, drafts of #1-4 and 6 were to be accomplished
within the grant period, with principal emphasis on the Evaluation
Handbook. Authors of each of these revisions were asked to prepare a
detailed outline. After these outlines were reviewed by the series editors
and modified as necessary, authors started their writing tasks. Drafts of
these manuscripts are appended in the following sections. Please note that
the publisher (SAGE) is not requiring new camera-ready copy for the
revisions. Therefore, in the interests of economy, xeroxed copies of
portions of the current kit are included within the appended manuscripts.






EVALUATION HANDBOOK 
(DRAFT) 



Revision Editors: Joan Herman

Marv Alkin 



December 6, 1985 






TABLE OF CONTENTS

INTRODUCTION

Components of the kit
Kit vocabulary

CHAPTER ONE ESTABLISHING THE PARAMETERS OF AN EVALUATION

An Evaluation Framework
Conceptualizing the Evaluation
How to Determine a General Technical Approach to an Evaluation
How does an Evaluator Decide what to Measure?

CHAPTER TWO HOW TO PLAY THE ROLE OF AN EVALUATOR:
A Review of Formative and Summative Evaluations

Agenda A: Setting Boundaries for the Evaluation
Agenda B: Selecting Appropriate Design and Measurements
Agenda C: Data Collection and Analysis
Agenda D: Final Reports

CHAPTER THREE STEP-BY-STEP GUIDES FOR CONDUCTING AN EVALUATION

Agenda A
Agenda B
Agenda C
Agenda D

CHAPTER FOUR STEP-BY-STEP GUIDE FOR CONDUCTING A SMALL EXPERIMENT

APPENDIX A: INDEX TO THE PROGRAM EVALUATION KIT






INTRODUCTION 



The Program Evaluation Kit is a set of books intended to assist people 
who are conducting program evaluations. Its potential use is broad. The 
kit may be an aid both to experienced evaluators and to those who are 
encountering program evaluation for the first time. Each book contains
step-by-step procedural guides to help people gather, analyze, and 
interpret information for almost any purpose, whether it be to survey 
attitudes, observe a program in action, or measure outcomes in an elaborate 
evaluation of a multi-faceted program. Examples are drawn from 
educational, social service, and business settings. 

In addition to suggesting step-by-step procedures, the kit also 
explains concepts and vocabulary common to evaluation, making the kit 
useful for training or staff development. 

COMPONENTS OF THE KIT

The Program Evaluation Kit consists of the following nine books, each 
of which may be used independently of the others. 

1. This book, The Evaluator's Handbook, provides an overview of evaluation
activities and a directory to the rest of the kit. Chapter 1 suggests
an evaluation framework which is based upon common phases of program
development. Chapter 2 discusses things to consider when trying to
establish the parameters of an evaluation. Chapter 3 presents specific
procedural agendas for conducting evaluations. Chapters 4, 5, and 6
contain specific guides for accomplishing three general types of
evaluations: a formative evaluation, a standard summative evaluation,
and a small experiment. 



The Handbook concludes with a Master Index to topics discussed 
throughout the Kit. 

2. How to Define Your Role as an Evaluator provides advice about focusing
an evaluation, that is, deciding upon the major questions the 
evaluation is intended to answer and identifying the principal audience 
for the evaluation. 

3. How to Design a Program Evaluation discusses the logic underlying the 
use of research designs - including the ubiquitous pretest-posttest
design - and supplies step-by-step procedures for setting up and
interpreting the results from experimental, quasi-experimental, and 
time series designs. Six designs, including some unorthodox ones, are 
discussed in detail. Finally, the book includes instructions about how 
to construct random samples. 

4. How to Use Qualitative Methods in Program Evaluation explains the basic 
assumptions underlying qualitative procedures and suggests the most 
appropriate situations for using qualitative designs in evaluations. 
(To be revised upon review of Michael Patton's manuscript). 

5. How to Measure Program Implementation presents methods for designing 
and using measurement instruments - examination of program records,
observations, and self-reports - to accurately describe how a program
looks in operation. (To be revised upon completion of Jean King's
manuscript.)

6. How to Measure Attitudes should help an evaluator select or design
credible instruments to measure attitudes. The book discusses problems
involved in measuring attitudes, including people's sensitivity about
this kind of measurement and the difficulty of establishing the
reliability and validity of individual measures. It lists myriad
sources of available attitude instruments and gives precise
instructions for developing questionnaires, interviews, attitude rating
scales, sociometric instruments, and observation schedules. Finally,
it suggests how to analyze and report results from attitude measures.

7. How to Measure Performance (complete the description upon review of
the manuscript). 

8. How to Calculate Statistics is divided into three sections, each 
dealing with an important function that statistics serves in
evaluation: summarizing scores through measures of central tendency 
and variability, testing for the significance of differences found 
among performances of groups, and correlation. Detailed worksheets, 
non-technical explanations, and practical examples accompany each 
statistical procedure. (This may be revised based upon Gene Glass's
suggestions.)

9. How to Present Evaluation Findings is designed to help the evaluator convey
to various audiences the information that has been collected during
the course of the evaluation. It contains an outline of a standard 
evaluation report, directions for written and oral reporting, and model 
tables and graphs. 

KIT VOCABULARY 

For those who have had little experience with evaluation, it might be 
helpful to review a few basic terms which are used repeatedly throughout
the Program Evaluation Kit. A PROGRAM is anything you try because you think
it will have an effect. A program might be something tangible, such as a
set of curriculum materials; a procedure, like the distribution of
financial aid; or an arrangement of roles and responsibilities, such as the
reshuffling of administrative staff. A program might be a new kind of
scheduling, like a four-day work week; or it might be a series of
activities designed to improve workers' attitudes about their jobs. A
program is anything definable and repeatable.

When you EVALUATE a program, you systematically collect information
about how the program operates, about the effects it may be having, and/or to
answer other questions of interest. Sometimes the information collected is
used to make decisions about the program, for example, how to improve it,
whether to expand it, or whether to discontinue it. Sometimes evaluation
information has only indirect influence on decisions; sometimes it is
ignored altogether. Regardless of how it is ultimately used, program 
evaluation requires the collection of valid, credible information about a 
program in a manner that makes it potentially useful. 

Generally an evaluation has a SPONSOR. This is the individual or the
organization who requests the evaluation and usually pays for it. If the
members of a school board request an evaluation, they are the sponsors. If
a federal agency requires an evaluation, the agency is the sponsor.

Evaluations always have AUDIENCES. An evaluation's findings are of 
course reported to sponsors, but there might be other people interested in
or directly affected by the findings. A common audience for information
collected during program development might consist of program planners,
managers, and staff who run the program. Another audience might be the
recipients of the services or products; for example, students, parents, or
customers. If the program will be expanded to additional sites, or if it
is reported in widely circulated publications, then the broader scientific,
educational, public service or business community comprises an evaluation 
audience. In short, audiences are the groups that you will have to keep 
in mind as you conduct the evaluation. If your audiences share a common 
point-of-view about the program or are likely to find the same evaluation 
information credible, consider yourself lucky. This is not always the 
case. 

For some evaluations, of course, the roles of evaluator, sponsor, and 
audience are all played by the same people. If teachers or managers decide
to evaluate their own programs they will be at once the sponsors, the 
audience, the program managers, and the evaluators. Although the kit 
treats these roles as distinct, it is understood that people sometimes fill 
overlapping functions. 

One decision that an evaluator makes affects the credibility of the
evaluation for many audiences. This is the selection of an EVALUATION
DESIGN, a plan determining what individuals or groups will participate in
the evaluation, what types of data will be collected, and when evaluation
instruments or measures will be administered and to whom. (The instruments
could include tests, questionnaires, observations, interviews, inspections
of records, etc.) The design provides a basis for better understanding the
program and its effects. More traditional quantitative designs focus
primarily on measuring program results and comparing them to a standard.
Such comparisons (including comparisons with other programs) give some
perspective about the magnitude of a program's effect and help the
evaluator and relevant
audiences determine whether it indeed is the program which brings about
audiences determine whether it indeed is the program which brings about 
particular outcomes. In contrast, newer qualitative designs focus on 
describing the program in depth and on better understanding the meaning and 
nature of its operations and effects. 

The focus of the Program Evaluation Kit is the collection, analysis, 
and reporting of valid, credible information which can have some 
constructive impact on program decisionmaking. 






CHAPTER I 

ESTABLISHING THE PARAMETERS OF AN EVALUATION 



AN EVALUATION FRAMEWORK 

Literature in the changing field of program evaluation has been marked 
by various evaluation models which serve to conceptualize the field and to 
draw boundaries on the evaluator's role. Descriptions of some of the more
prominent educational evaluation models appear in Table 1. If you plan to
spend considerable time working as an evaluator, the references in this
table and the readings listed in the For Further Reading section at the end
of the chapter should help you catch up on what evaluators have said about
their craft. 

This kit has drawn its prescriptions about how to conduct program 
evaluations from most of the models in Table 1. Each model is appropriate
to a particular set of circumstances and since the kit's purpose is to help 
you decide what to do in different situations, its advice is eclectic, 
borrowed from various models. 

Most of the evaluation models described in Table 1 outline the 
technical procedures their proponents believe should be followed in 
evaluations; some also consider the socio-political factors which need to 
be considered. The Program Evaluation Kit also shows this dual concern.
It focuses not only on how to accomplish the technical requirements of an
evaluation but also on how to structure the evaluation to facilitate the
use and impact of its findings. This pragmatic perspective reflects a
common observation that evaluation since the early 1960's has been grossly
underutilized. 






Early evaluation models reflected a general optimism that systematic,
scientific procedures would deliver unequivocal evidence of program success
or failure. "Hard data" could provide both sound information for planning
more effective programs and a rational basis for educational
decisionmaking. It was assumed that clear cause-effect relationships could
be established between programs and their outcomes and that program
variables could be manipulated to reach desired effects. In light of these
hopes, thousands of evaluations were conducted throughout the 1960's and
1970's. Unfortunately, most of these evaluations did not have the expected
impact, and many have questioned whether these evaluations had any impact
at all. Believing in the potential contribution of their work to
educational planning and policy, evaluators became concerned about how to
have their findings used and not simply filed. At the same time, they came
to realize that not all social programs are discrete entities with easily
recognizable stages in a predetermined process of natural development.
Programs are often amorphous, complex mobilizations of human activities and 
resources, embedded in political and social networks. It is a rare program 
which exists in hermetically sealed isolation, perfectly appropriate for 
scientific measurement and duplication. 

The Program Evaluation Kit reflects the need for a flexible approach 
that considers the complex environment in which a program exists as well as 
the purpose and context of its evaluation. An evaluator must be aware of 
the decision-making context within which the evaluation is to occur. She 
must consider the perceptions and expectations of various audiences, the 
developmental phase of the program under investigation, as well as the 
technical issue of which methodology to use in gathering data. 



Evaluations are quite situation specific, but some generalizations or 
rules of thumb can be offered about how to conduct them. The following 
sections will explain four general phases during the life of a program 
when evaluations are commonly conducted. The phases are certainly not 
clearly separate. They quite often overlap, and some programs skip certain 
phases entirely. The evaluations during each phase differ according to
their primary audiences, according to the decisions which sponsors will
have to make, according to the timing of data collection and reporting, and
according to the general relationship of the evaluator to the program
during the course of the evaluation.

Program Initiation

Early in the development of a program, sponsors, managers, and 
planners consider the goals they hope to accomplish through program 
activities, and identify the needs and/or problems that a program is 
supposed to redress. Formally or informally, every program, in fact, goes 
through some kind of needs assessment even though it may not be obvious 
whose needs are being defined nor that the process is very rigorous. 
In some cases, the needs are simply assumed, and planners proceed to 
structure activities accordingly. At other times, the sponsor or funding 
agency more or less declares a need by making money available for programs 
aimed at general goals. Sometimes, however, a systematic effort is made to 
verify that perceived needs actually exist, to prioritize their importance,
and/or to identify specific underlying problems. If a school program is
intended as a response to community needs, for example, an evaluator
conducting a needs assessment may gather information broadly from parents,
teachers, students, and a sample of the broader community. Similarly, in
trying to help structure a program to increase staff morale, an evaluator
may observe closely and broadly survey employees, their supervisors, 
experts, and others in order to uncover the source of the morale problem 
and its potential solution. Such formal needs assessments often try to 
gather input from a broad range of sources. Sometimes, however, a more 
restricted approach is preferable, e.g., where needs are very specialized
or highly technical. In such cases, an evaluator might only solicit the 
opinions of experts. 

The point is that programs are often initiated in response to critical 
needs, to achieve high priority goals, or to solve existing problems. 
Program Planning 

A second phase in the life of a program is its planning. Ideally, a 
program is designed to meet the highest priority goals established by a 
needs assessment. At times, the need to reach certain goals will prompt
planners to design a new program from scratch, putting together materials,
activities, and administrative arrangements that have not been tried
before. Other situations will require that they purchase and install, or
moderately adapt an already existing program. Both situations qualify as
program planning: something that has not occurred previously in the
setting is created for the purpose of meeting desired goals. During this
phase, controlled pilot testing and field tests can be used to determine 
the effectiveness and feasibility of alternative methods of addressing 
primary needs and goals. While it is desirable, at this point, to 
establish plans for conducting evaluations, practice rarely meets this 
ideal. 



Program Implementation 

The third phase occurs as the program is being installed. Suppose, 
for example, that urban planners want to try out a new management 
information system. Purchases are made, boxes delivered, and training 
planned. This will be the first year of the new system. Ideally, the 
program's sponsors should give the new system a chance to make mistakes, 
solve problems, and reach the point where it is running smoothly before 
they decide how good or bad it is. All the time a program is in this 
implementation stage, subject to trial and error, the staff is trying to 
operationalize it suitably and revise it as necessary to meet their
particular situation. Evaluations during this phase need to be formative,
that is, seeking to describe how the program is operating and to suggest
ways to improve it. Formative evaluation can take many forms such as
special surveys of program services, ethnographic studies, or analyses of
administrative records to determine how the program actually operates. In
this formative case, the evaluator may work very closely with the program
staff and report both formally and informally about findings as they
emerge.

Program Accountability 

When a program has become established with a permanent budget and an 
organizational niche, it might be time to question its overall impact. 
Judgments may need to be made about whether or not to continue the program, 
whether it should be expanded, and whether it might be used at other
sites. During this phase, the evaluation is summative, and it is of most
direct concern to program policy makers.



Ideally, because the summative evaluator represents the interests of
the sponsor and the broader community, he should try not to interfere with
program operations. The summative evaluator's function is not to work with
the staff and suggest improvements while the program is running, but rather
to collect data and write a summary report showing what the program looks
like and what has been achieved. Such ideal detachment, however, is rarely 
possible and even the most detached findings can serve a useful formative 
purpose. Evaluators, in fact, are often expected to serve both formative 
and summative functions, seeking to contribute to program improvement and 
to provide summary judgements of program worth. Such expectations must be 
approached cautiously as some objectivity may be lost if a summative 
evaluator scrutinizes a program in which she has developed a personal 
stake. 

In recent years, organizations have turned more and more toward hiring
evaluators who are permanent members of the staff. These internal 
evaluators generally perform formative functions. They are often an 
adjunct to management, working to increase organizational efficiency and 
effectiveness on a regular basis. However, care must be taken that the 
evaluator, regardless of where or how he is employed, maintain 
integrity, objectivity, and an appropriate sense of differentiation. 
CONCEPTUALIZING THE EVALUATION

"We'd better have an evaluation of Program X," someone could decide
and then appoint you to carry out that decision. Proceed with this 
caution: 



Your first act in response to this assignment should be to find out 
what evaluation means in this instance. Find out what is expected. 
What information will the evaluation be expected to provide? Does the
sponsor or another audience want more information than you can
possibly provide? Do they want definitive statements that you will
not be able to make? Do they want you to take on an agnostic or
advocate role toward the program that you cannot in good conscience
assume? 

Failure to reach a common understanding about the exact nature of the
evaluation could lead to wasted money and effort, frustration, and acrimony
if sponsors feel they did not get what they expected. Step one in any
evaluation is to negotiate!

Immediately after accepting the assignment, try to get a clear picture 
of what you will be expected to do. This conceptualization will have six
major considerations, each negotiated with the sponsor and the audiences:

1. A decision about what people really want when they say they
want an evaluation.

2. Identification of what the audiences will accept as credible
information.

3. Choice of a reporting style. This may include the extent to which
you report quantitative or qualitative information, whether you
will write technical reports, brief notes, or confer with the
staff, and the timing of important reports.

4. Determination of a general technical approach based upon
information and credibility needs.

5. A decision about what to measure and/or gather information about.

6. Delineation of what you can accomplish within the constraints of
the evaluation's budget and political situation.

Each of these six considerations will be discussed in more detail
below.

Determining What People Really Want 
When They Say They Want an Evaluation

The sponsor who commissions the investigation might have in mind any
one of several kinds of activities that could be called evaluations. They
are all closely related, and in some cases more than one may be required
for a single project. In general, the activities may be classified loosely
into five types of evaluations based upon the ultimate use of the
findings. A request for an evaluation may actually be a charge to collect
information:

* To conduct a needs assessment. 

* To describe what a program looks like in operation. 
This is an implementation evaluation. 

* To measure whether goals have been achieved. 

* To help managers plan the program and keep it running smoothly. 
This is a formative evaluation. 

* To help the sponsor and others in authority decide the program's 
ultimate fate. This is a summative evaluation. 

Each of these activities requires a somewhat different approach and 
various amounts of time and money, so it is crucial that the sponsor's
primary purposes for having the evaluation done be made as clear as
possible. How will the findings be put to use? Who is the main audience?
At what general stage in the development of the program is the evaluation
taking place? Through frequent interactions with the sponsor early in the
study, identify and focus the relevant evaluation questions. 

The boxes on pages — to — describe the five kinds of 
investigations usually conducted under the title evaluation. Each is
characterized by the types of questions which the sponsor and the evaluator
might typically consider, by the general activities that could be expected
to occur, and by the decisions that might be affected. Note that 
recommendations for conducting formative and summative evaluations 
encompass the activities required for needs assessment, program 
implementation evaluation, and assessment of goal achievement. The Program 
Evaluation Kit contains enough information to help you perform any of the 
five types. 



CHART 1 NEEDS ASSESSMENT 
(Also called an Organizational Review) 

Significant questions for sponsors and evaluators :
*What are the goals of the organization?
*What should the program(s) try to accomplish?
Can goal priorities be determined?
Is there agreement on the goals from all groups?
To what extent are goals being met?
What in the organization is succeeding or failing?
Is there a need to establish new programs or to revise
old ones based upon identified needs?
Activities :

The evaluator might discover that the aim of the evaluation is neither 
to decide between continuing or dropping a program nor to develop detailed 
activities to improve a program as it proceeds. Rather, a sponsor wants to 
discover problem areas in the current situation which might eventually be 
remedied. A needs assessment often precedes specific program planning and 
can be used to re-examine existing goals and/or to make implicit goals more
explicit. 

Decisions and actions likely to follow a needs assessment 

The decisions following a needs assessment usually involve allocation 
of resources to meet high priority needs. New programs may be planned or 
old ones revised to address the identified needs. The survey of needs is 
itself the end product in this type of evaluation, unlike a formative
evaluation, where the evaluator works with the organizational staff to
improve identified weaknesses during the course of the investigation. 
Kit components of greatest relevance : 

How to Define Your Role as an Evaluator 

How to Measure Achievement 

How to Measure Attitudes 

How to Measure Program Implementation 
(To be revised upon revision of kit components) 



CHART 2 PROGRAM IMPLEMENTATION EVALUATION 
(Also called a Program Documentation) 



Significant questions for sponsors and evaluators : 
*What is happening in Program X? 
*Is the program being implemented according to plan? 
*What do participants in the program experience?
How many and which participants and staff are taking part? 
What is a typical schedule of activities? 
How are time, money, and personnel allocated? 
How much does the program vary from one site to another? 
Activities : 

A description of program implementation focuses on the activities, 
materials, and administrative arrangements that comprise a program. It 
does not include an examination of the results of program activities as 
would a formative or a summative evaluation. The audience wants a 
description of who is doing what in the program or of how a requirement has 
been interpreted by the program planners and developers across sites. Be 
sure to make it clear that program activities will not be related to 
outcomes. For many audiences, a description of what is taking place is 
sufficient information for making decisions about the program. This is 
particularly true when the program is designed to reflect a philosophy of 
how organizations should be run in order to achieve long-term goals.
Decisions and actions likely to follow an implementation study : 

The information from the evaluation may be included in a larger 
formative or summative investigation. Sponsors are likely to judge the
program on the basis of whether or not they think the activities occurring
are valuable in themselves or would probably be effective in achieving 
other goals. 

Kit components of greatest relevance :

How to Measure Program Implementation 

How to Design a Qualitative Evaluation 
(To be revised when the kit components are revised) 



CHART 3 MEASUREMENT OF GOAL ACHIEVEMENT 



Significant questions for sponsors and evaluators : 

*Is Program X meeting its goals? 

*Is the program meeting its goals? 
How can goal attainment be measured most credibly?
Activities : 

The evaluator attempts only to measure the extent to which the 
program's highest priority goals are being achieved. It is important to 
emphasize that you will not be able to state whether the program alone is 
responsible for the observed results and certainly not whether some other 
program would have been better. Even though looking at goal achievement 
alone usually provides a poor basis for judging a program's comparative
merits, your results can still be of some use. Determining the extent to
which achievement matches a set of carefully considered standards does give
a basis for at least tentative conclusions about the program's quality.
Decisions and actions likely to follow measures of goal achievement :

Planners may choose to reconsider goals and to focus program 
activities more appropriately to achieve significant goals. The 
information from this type of evaluation might be used in a more extended 
formative or summative investigation. 
Kit components of greatest relevance : 

How to Measure Achievement 

How to Measure Attitudes 
(To be revised when kit components are revised) 



CHART 4 FORMATIVE EVALUATION



Significant questions for sponsors and evaluators : 
*How can the program be improved as it develops? 
What are the program's goals and objectives? 
What are the program's most important characteristics-materials, 
activities, administrative arrangements? 

How are the program activities supposed to lead to attainment of the 
objectives? 

Are the program's important characteristics being implemented? 

Are they leading to achievement of the objectives? 

What adjustments in the program might lead to the attainment of the
objectives?

Which activities are best for each objective? 
Are some better suited to certain participants? 
What measures and designs could be recommended for use during a 
summative evaluation of the program? 
Activities : 

Formative evaluation encompasses the thousand-and-one jobs connected
with providing information for the staff to get the program running
smoothly. It might even include conducting a needs assessment.
Certainly it will involve some attention to monitoring program
implementation and achievement of goals. In order to improve a
program, it will be necessary to understand how well a program is
moving toward its objectives so that changes can be made in the
program's components. Formative evaluation is time-consuming because
it requires becoming familiar with multiple aspects of a program and
providing program personnel with information and insights to help them
improve it. Before launching into formative evaluation, make sure
that there is actually a chance of making changes for improvement. If
no such possibility exists, formative evaluation is not appropriate.

Decisions and actions likely to follow a formative evaluation :

As a result of formative evaluation, revisions can be made in the
materials, activities, and organization of the program. These 
adjustments are made throughout the course of the evaluation. 

Kit components of greatest relevance : 
All of them. 



CHART 5 SUMMATIVE EVALUATION 



Significant questions for sponsors and evaluators :

*Is Program X worth continuing or expanding, or should it be 
discontinued? 

What are Program X's most important characteristics (materials, 
activities, administrative arrangements, etc.)? 
Do the activities lead to goal achievement? 
What programs are available as alternatives to Program X? 
How effective is Program X in comparison with alternatives? 
How costly is the program? 
Activities : 

The goal of summative evaluation is to collect and to present
information needed for summary statements and judgments of the program and
its value. The evaluator should try to provide a basis against which to
compare the program's accomplishments. One might contrast the program's
effects and costs with those produced by an alternative program that aims
toward the same goals. In situations where such a comparison is not
possible, participants' performance might be compared with that of a group
receiving no such program at all. The standard for comparison might come
from the norms of achievement tests or from a comparison of program results
with the goals identified by the program designers or the community at
large.

In some instances, summative evaluation is not appropriate. A summary
statement should not be written, for instance, about a program that has not
been in existence long enough to be fully developed. The more a program
has clear and measurable goals and consistent, replicable materials,
organization, and activities, the more suited it is for a summative 
evaluation. 

Decisions and actions likely to follow a summative evaluation : 

Decision makers may use information from summative evaluations to help
them decide whether to continue or to discontinue a program or whether to
expand it or reduce it.

Kit components of greatest relevance : 
All of them. 



What Will Be Accepted as Credible?

In addition to finding out what your audiences want to know, you will 
need to discover what they will accept as credible information. The 
credibility of the evaluation will, of course, be influenced by your own 
credibility, a judgment that will be based on your perceived competence as 
well as your personal style. For some, your perceived competence in 
technical skills or in reporting may be most important. For others, your 
expertise in program subject matter may be a primary consideration. The 
audience will be less skeptical if they are confident you know what you are 
doing. A skilled evaluator also has excellent interpersonal skills and is 
able to nurture trust and rapport with various users and audiences. 

Your audiences' willingness to accept without question what you report 
will be based on other criteria as well. For one thing, they will take 
account of your allegiances. An evaluator must be perceived as free to 
find fault — whether or not she does. This means that you should not be 
constrained by friendship, professional relations, or the desire to receive 
future evaluation jobs. In addition, audiences will believe what the 
evaluator reports to the extent that they see her as representing 
themselves. The program staff, for instance, may be suspicious of a 
formative evaluator who will write a summary report at year's end to the 
funding agency. The agency, on the other hand, will read the report
suspiciously if it suspects that the evaluator's formative work has put her
on "their side." Because of these credibility problems, evaluators with
ambiguous formative-summative job descriptions have to arrive at a
determination of their primary audience through negotiation. 



Another determiner of how seriously the audience listens to your
results is the method you use for gathering information. Methods of data
gathering include the evaluation design; the instruments administered; the 
people selected for testing, questioning, or observation; and the times at 
which measurements are made. The specific methods you select will depend 
on whether you and your audience favor more quantitative or more 
qualitative approaches to the evaluation. When you choose a general 
approach, select instruments and designs, or construct a sampling plan, 
remember this: You cannot count on your audience to accept as credible the 
same sorts of evidence that you consider most acceptable. People are 
usually skeptical, for instance, of arguments they do not understand. You
might have noticed that when reports filled with complicated data analysis 
are presented, people stay quiet until a few experts have given their 
opinions. Then everyone discusses the opinions of the experts. 

Unless your audience expects complex analyses, you should keep your 
data gathering and analyses straightforward. Think of yourself as a 
surrogate, delegated to gather and digest information that your audience 
would gather on its own, were it able. Keep a few representative members 
of the evaluation audience in mind, and ask yourself periodically: "Will
Mr. Carson see the value in collecting this data or in doing this
analysis?" A good way to find out, of course, is to ask Mr. Carson.

Remember, as well, that the general public tends to place more faith 
in opinions and anecdotes than do researchers — at least usually. If you 
plan to collect a large amount of hard data, you will have to educate 
people about what it means.



Deciding on Reporting Requirements and Style 

Why worry about reporting when you're conceptualizing your evaluation 
project? First, because each formal report requires time to write and
produce, reporting can have important implications for the project budget,
particularly if different reports are to be produced to meet the needs of
different audiences. Formal reports, too, are only part of what is
required to help assure a useful project: reporting, both formal and
informal varieties, should be an on-going process during the life of an
evaluation, not just an end-of-project product.

A second reason for considering reporting requirements early is their
influence on the methodology and impact of the evaluation. How reports are
perceived by their potential audiences, the credibility of the evidence
presented, and the persuasiveness of findings and conclusions depend
not only on what is presented but on how it is presented as well. What
kinds of reports are desired? Preferred reporting style is applicable
here. It refers to the relative weight a report gives to quantitative and
to qualitative data and the degree of formality with which the report is
delivered. Do the report audiences, for example, prefer evidence in the
form of tables of means, percentages, etc., in the form of charts and
graphs, and/or in the form of characteristic anecdotes? Such preferences
need to be articulated and negotiated early in the planning process. (The
kit book, How to Communicate Evaluation Findings, should be of help to you
as you negotiate a reporting strategy.) 
Determining the General Evaluation Approach 

Part of what determines the credibility of an evaluation, as mentioned 
above, is the credibility of the technical approach: the credibility of
the design, methods, measures, etc. that are utilized for answering the
questions of interest in the evaluation. How do you choose an appropriate 
technical approach? The answer lies in the interplay between the 
evaluator's predispositions, client preferences, and most importantly the 
information needs of the evaluation. 
Quantitative Approaches 

You are probably aware that technical approaches are often 
dichotomized into two general categories: Quantitative approaches and 
qualitative approaches. Quantitative approaches have been most prevalent
historically in evaluation studies, particularly in evaluation studies
intended to measure program effects. Quantitative approaches are concerned
primarily with measuring a finite number of pre-specified outcomes, with
judging effects and with attributing cause by comparing the results of such
measurements in various programs of interest, and with generalizing the
results of the measurements and the comparisons to the population as a
whole. The emphasis is on measuring, summarizing, aggregating and
comparing measurements, and on deriving meaning from quantitative
analyses. (Quantitative approaches also may be used in rating,
classifying, and quantifying particular pre-defined aspects of program
operations.) Such approaches utilize experimental designs, require the use
of control groups, and are particularly important when program
effectiveness is the primary evaluation issue.

The Importance of Design and Control Groups in Quantitative Approaches 

Why are designs and control groups so important? You probably already 
know a bit about design: that it involves assignment of students or
classrooms to programs, and to comparison or control groups. The purpose
of this discussion is to present you with the logic underlying the need for
good design in evaluations where you want to show that there is a
relationship between program activities and outcomes. 

First consider the common before-and-after design. In the typical
situation, a new program has been instituted and an evaluation planned. 
The evaluator administers a pretest at the beginning of the program, and at 
the end of the program, a posttest as in the following examples: 

° A new district-wide mathematics program is evaluated. The 
California Achievement Test is administered in September and again 
in May. 

° A new halfway house program has set itself the goal of decreasing 
recidivism in its juvenile clients. The evaluator observes and 
records the number of arrests and of convictions of its clients at 
the beginning of the year and then again at the end of the year. 

° An objective of a corporate reorganization project is to increase 
staff morale and productivity. Staff fills in a questionnaire at 
the beginning of the year and then again at the end of the year; 
productivity indices are likewise computed at the beginning and end
of the year. 

Differences between pretest and posttest results are then scrutinized to determine
whether the program did what it was supposed to do. This is where the
before-and-after design leaves the evaluation vulnerable to challenge. It
fails to answer two important questions:

1. How good are the results? Could they have been better? Would
they have been the same if the program had not been carried out?

2. Was it the program that brought about these results or was it
something else?

Consider the following situation. A new math program has been put
into effect in the Lincoln District. Ms. Pryor, the superintendent, wants
to assess the quality of the program by examining students' grade
equivalent scores from a standardized math test given in September and then
again in May. She notes that the sixth grade average was 5.4 in September
and 6.5 in May. She attempts to judge the value of the new math program
based on this pretest-posttest information.

Results on State Math Test - 6th Grades

Math Program                     Sept. Pretest (G.E.)   May Posttest (G.E.)

Sunnydate Learning Associates    5.4                    6.5

The Lincoln students in the example have shown a considerable gain in
math from pretest to posttest: 1.1 grade equivalent points. On the
other hand, they are still not performing at grade level. Therefore Ms.
Pryor must ask herself, How good are these results? The answer depends, of
course, on the children and the conditions in the school and home. For
some groups, this would represent real progress; for others, it would
indicate serious difficulties in the program.

How can Ms. Pryor find out what progress she should expect from her
sixth graders? The pretest tells her something: the sixth graders were
six months behind in September, and they ended up only four months behind
in May. Perhaps without the new program they would have ended up five
months behind. Or perhaps they would have done better with the old
program! In order to know what difference the program made, she needs to
know how the students would have scored without the program.
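
The arithmetic behind Ms. Pryor's reading of the scores can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the expected grade levels of 6.0 in September and 6.9 in May are assumptions inferred from the statements that the students were six and then four months behind (one school month taken as 0.1 grade equivalent point).

```python
def gain(pretest, posttest):
    """Pretest-to-posttest gain in grade equivalent (G.E.) points."""
    return round(posttest - pretest, 1)


def months_behind(score, expected_level):
    """Months below grade level, assuming 0.1 G.E. point per school month."""
    return round((expected_level - score) * 10)


# Average scores from the Lincoln District example.
pretest, posttest = 5.4, 6.5

# Assumed grade-level expectations for sixth graders (not stated in the text).
SEPTEMBER_EXPECTED = 6.0
MAY_EXPECTED = 6.9

print("Gain:", gain(pretest, posttest))
print("Months behind in September:", months_behind(pretest, SEPTEMBER_EXPECTED))
print("Months behind in May:", months_behind(posttest, MAY_EXPECTED))
```

Neither figure says what the students would have scored without the program, which is the question the before-and-after design leaves open.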

Ms. Pryor has another problem in interpreting her results. She cannot 
even show that the gains she did get on the posttest were brought about by
the new program. Perhaps there were other changes that occurred in the 
school or among the students this year - a drop in class size, or a larger 
number of parents volunteering to tutor, or the miraculous absence of 
"difficult" children who demanded teacher time and distracted the class. 
Many influences might cause the learning situation to alter from year to 
year. 

Ms. Pryor could have ruled out most of these explanations of her 
results by using a control group. First, two randomly formed groups would 
have been assigned at the beginning of the year to either the new math 
program or to another semester of the old one (or to another alternative 
program). Before the program began, both groups would have been 
pretested. At the end of the year, the groups would be posttested using 
the same math test.

Because the two groups were initially equivalent, the scores of the 
control group would show how the new program students would have scored if 
they had not received the new program: 



Program                        Pretest    Posttest

Sunnydate (x)                  5.4        6.5

Old Program (Control Group)    5.4        6.1

But was it the new program that brought about the improvement, or was it 
some other factor? Using a true control group design, Ms. Pryor can 
discount the influence of other factors as long as these factors have 
probably also affected the control group. If, for instance, some students
had had an enriched nursery school program that got them off to a good
start in math, the random assignment should have spread these students
fairly evenly between the two groups. If more parents were helping in the
school, this should have benefitted both groups equally. If this year's
sixth grade was generally quieter, with fewer difficult children, this
should have affected both the experimental and control groups equally.
Ms. Pryor does not even have to know what all the factors might have been.
By randomly assigning the two groups, the influence of various factors
affecting the math achievement of the two groups is likely to be
equalized. Then differences observed in outcomes can be attributed to the
one factor that has been made deliberately different: the math
program.
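
The logic of the true control group design can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed procedure: the roster of 60 students, the seed, and the function names are all hypothetical, and only the four group averages come from the example.

```python
import random


def assign_groups(students, seed=None):
    """Randomly split students into program and control groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(students)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def program_effect(program_pre, program_post, control_pre, control_post):
    """Gain of the program group beyond the gain of the control group."""
    program_gain = program_post - program_pre
    control_gain = control_post - control_pre
    return round(program_gain - control_gain, 1)


# Hypothetical roster of 60 sixth graders, identified by number.
roster = range(1, 61)
program_group, control_group = assign_groups(roster, seed=1985)
print(len(program_group), len(control_group))

# Group averages from the Sunnydate and Old Program example.
print(program_effect(5.4, 6.5, 5.4, 6.1))
```

Because both groups start at 5.4, the difference between their gains is the part of the improvement that can be credited to the new program rather than to class size, tutoring, or other influences shared by both groups.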

Though much maligned as impractical, the true control group design
produces such credible and interpretable results that it should at least be
considered an ideal to be approximated when evaluation studies are
planned.² The design is valuable because it provides a comparative basis
from which to examine the results of the program in question. It helps to
rule out the challenges of potential skeptics that good attitudes or
improved achievement were brought about by factors other than the program.

² Actually, true control group designs have been used in evaluation of
many educational and social programs. A list of 141 of them, with
references, is contained in Boruch, R.F. Bibliography: Illustrative
randomized field experiments for program planning and evaluation.
Evaluation, 1974, 2(1), 83-87.

It is not always easy to convince people that random assignment and 
experimentation are good things; and of course you must make decisions that 
are consistent with the opinions of your audience.

Consider using a design for planning the administration of each 
measurement instrument you will use. Consider a randomized design first. 
If this is not possible, then look for a non-equivalent control group:
people as much like the program group as possible but who will receive no
program or a different program. Or try to use a time-series design as a
basis for comparison: find relevant data about the former performance of
program groups or of past groups in the same setting. Only if none of
these designs is possible should you abandon using a design. An evaluation
that can say "Compared to such-and-such, this is what the program produced"
is more interpretable than one like Ms. Pryor's that simply reports scores
in a vacuum. (How to Design a Program Evaluation provides more detail on
this subject.) 
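
The order of preference described above can be written down as a simple decision rule. The sketch below is only a restatement of that ordering; the function name and its three yes/no parameters are hypothetical simplifications of what will in practice be a negotiated judgment.

```python
def choose_design(can_randomize, has_comparison_group, has_past_data):
    """Return the strongest feasible design, in the order the text recommends."""
    if can_randomize:
        return "true control group (randomized)"
    if has_comparison_group:
        return "non-equivalent control group"
    if has_past_data:
        return "time-series"
    return "no design"


# Example: random assignment is not possible, but a similar group is available.
print(choose_design(can_randomize=False, has_comparison_group=True, has_past_data=True))
```

The rule falls through to "no design" only when every alternative has been ruled out, mirroring the advice to abandon a design only as a last resort.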

Qualitative Approaches Also Can Be Important 

While experimental design and control groups have traditionally been 
advocated in evaluation studies, in recent years qualitative methods have 
been given increasing attention. In contrast to the traditional deductive 
approach used in quantitative approaches, qualitative methods are 
inductive. The researcher or evaluator strives to describe and understand 
the program or particular aspects of it as a whole. Rather than entering 
the study with a pre-existing set of expectations or a prespecified






classification system for examining or measuring program outcomes (and/or
processes), the evaluator tries to understand the meaning of a program and
its outcomes from the participants' perspectives. The emphasis is on
detailed description and on in-depth understanding as it emerges from
direct contact and experience with the program and its participants. Using
more naturalistic methods of gathering data, qualitative methods rely on
observations, interviews, case studies and other means of fieldwork. (How
to Use Qualitative Methods provides more detail about qualitative
approaches.)

Traditionally, qualitative and quantitative approaches have been seen
as diametrically opposed, and many evaluators still strongly espouse one 
approach or the other. More recently, however, this view is beginning to 
change, and more and more evaluators are beginning to see the merits of 
combining both approaches in response to differing requirements within an 
evaluation and in response to different evaluation contexts. For example, 
if the purpose of an evaluation is to determine program effectiveness and 
the program and its outcomes are well defined, then a quantitative approach 
is appropriate. If, on the other hand, the purpose of an evaluation is to
determine program effectiveness, but the program and its outcomes are
ill-defined, the evaluator might start with a qualitative approach to 
identify critical program features and potential outcomes and then use a 
quantitative approach to assess their attainment. To take another, 
different example, suppose the purpose of an evaluation is program 
improvement, and more particularly to identify promising practices that 
might be updated in a number of program sites. An evaluator might use a 






quantitative approach to identify sites which were particularly successful 
in achieving program outcomes and then use a qualitative approach to 
understand how the successful sites were different from those with less 
success and to identify those practices which were related to their 
success. 

There is no single correct approach, then, to all evaluation 
problems. Some require a quantitative approach; some require a qualitative 
approach; many can derive considerable benefit from a combination of the 
two. 

How Does An Evaluator Decide What To Measure or Observe?

Having decided on a general approach, an evaluator might decide to 
measure, observe and/or analyze an infinite number of things: smiles per 
second, math achievement, time scheduled for reading, district 
innovativeness, sick days taken, self-concept, leadership, morale and on 
and on. 

Carrying out an evaluation in any area often is a matter of collecting 
evidence to demonstrate the effects of a program or one of its 
subcomponents and/or to help improve it. The program's objectives, your 
role, and the audience's motives will help you to make gross decisions 
about what to look at. Five general aspects of a program might be examined
as part of your evaluation:

° Context characteristics

° Student or client characteristics

° Characteristics of program implementation

° Program outcomes

° Program costs






Context Characteristics 

Programs take place within a setting or context - a framework of 
constraints within which a program must operate. They might include such 
things as class size, style of leadership in the school district
organization, time frame within which the program must operate, or budget.
It is especially important to get accurate information about aspects of the
context that you suspect might affect the success of the program. If, for
example, you suspect that programs like the one you are evaluating might be 
effective under one style of governance but not under another kind, you 
should try to assess leadership style at the various sites to explore that 
possibility. 
Client Characteristics 

Personal characteristics include such things as age, sex, 
socioeconomic status, language dominance, ability, attendance record, and 
attitudes. It may sometimes be important to see if a program shows
different effects with different groups of clients. For example, if
teachers say the least well-behaved students seem to like the program but
the best behaved students do not like it, you would want to collect ratings
of "well-behavedness" prior to the program and examine your results to
detect whether these different reactions did indeed occur. 
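
One simple way to examine results for different reactions across client groups is to average a rating separately for each group. The sketch below assumes each record is a pair of a group label and a numeric rating; the labels, the five-point scale, and the data are invented for illustration.

```python
from collections import defaultdict

def mean_by_group(records):
    """Average a numeric rating within each client group.

    records: iterable of (group_label, rating) pairs. The labels
    are whatever prior classification the evaluator collected.
    """
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum, count]
    for group, rating in records:
        totals[group][0] += rating
        totals[group][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}

if __name__ == "__main__":
    # Hypothetical "liking for the program" ratings on a 1-5 scale.
    ratings = [
        ("least well-behaved", 4), ("least well-behaved", 5),
        ("best behaved", 2), ("best behaved", 3),
    ]
    for group, average in mean_by_group(ratings).items():
        print(group, average)
```

A gap between the group averages would support the teachers' impression, though with small groups it should be treated as a lead to follow up rather than as a finding.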
Characteristics of Program Implementation 

Program characteristics are, of course, the principal materials,
activities, and administrative arrangements involved in the program.
Program characteristics are the things people do to try to achieve the 






program's goals. You will almost certainly need to describe these; though 
most programs have so many facets, you will have to narrow your focus to 
those that seem most important or most in need of attention. In summative
evaluations, these will usually be the characteristics that distinguish the 
program from other similar ones. 
Program Outcomes 

You often will want to measure the extent to which goals have been 
achieved. You must make sure, however, that all the program's important 
objectives have been articulated. Be alert to detecting unspoken goals 
such as the one buried in this comment: "I could see how much the audience 
enjoyed the program. This alone convinced me the program was good." At 
least in the eyes of the person who said this, enjoyment was a program 
goal, or a highly valued outcome, whether or not this was so stated in 
program plans. You also need to ask whether outcomes are immediately 
measurable. Some hoped for outcomes may be so long-range that only a study 
of many years' duration could establish that they had occurred. This would
be the case, for example, with goals such as "increased job satisfaction in
adult life" or "a life-long love of books." 

The evaluator should in general focus the evaluation on announced 
goals, but should be careful to include the possible wishes of the 
program's larger constituency - for example, the community - in formulating 
the yardsticks against which the program will be held accountable. 
Program Costs 
(insert to come) 






Rules of Thumb 

Beyond these general guidelines, decisions about exactly what 
information to collect will be situation-specific. Every program has 
distinctive goals; and every situation makes available unique kinds of 
data. Though there is no simple way to decide what specific information to
collect, or what variables to look at, there are some rules of thumb you
can follow:

1. Focus data collection where you are most likely to uncover program
effects if any occur.

2. Try to collect a variety of information.

3. Try to think of clever - and credible - ways to detect achievement
of program objectives.

4. Collect information to show that the program at least has done no
harm.

5. Measure what you think members of the audience will look for when
they receive your report.

6. Try to measure things that will advance the development of
educational theory.

Use of each of these pointers is discussed below.

Focus Data Collection Where You Are Most Likely 
To Uncover Program Effects If Any Occur 

While it is important that the evaluation in some way take note of
major but perhaps ambitious or distant goals, do not place major emphasis
upon them when deciding what to measure. One way to decide how to focus
the evaluation is to classify program goals according to the time frame in
which they can be expected to be achieved. Any particular intervention or
program is more likely to demonstrate a detectable effect on close-in
outcomes than on those either logically or temporally remote. This
means that you will also reduce the possibility of the program's showing 
effects if you focus on outcomes whose attainment is likely to be hampered 
by uncontrolled features of the situation. You should look for the 
program's effects close to the time of their teaching, and you should 
measure objectives that the program as implemented seeks to achieve. 

Consider, for example, a hypothetical situation in which an employee 
training program has been designed with the objective of increasing the 
communication skills of employees working in programs for inner-city 
clients. The program was instituted in order to eventually accomplish 
these primary goals: 

° To decrease employee absenteeism and early retirement because of
high pressure on the job

° To encourage congenial interpersonal relationships among employees
and clients

° To decrease the number of employee disciplinary referrals

In evaluating this program, you could measure the amount of employee
absences and the number of hostile employee-client encounters occurring
before, during, and after the program; and the number of employees sent for 
disciplinary action. These, after all, are measures reflecting the 
program's impact on its major objectives. 






There is a problem with basing the evaluation solely on these
objectives, however. Judgments of the quality of the program will then be
based only on the program's apparent effect on these outcomes. While these
are the major outcomes of interest, they are remote effects likely to come
about through a long chain of events which the employee training program
has only begun. A better evaluation would include attention to whether
employees learned anything from the training program itself or whether they
displayed the behaviors the training was designed to produce.

In general, since there are various ways in which a program can affect
its participants, one of the evaluator's most valuable contributions might
be to determine at what level the program has had an effect. Think of a
program as potentially affecting people in three different ways:

1. At minimum, it can make members of the target group aware that its
services are available. Prospective participants can simply learn
that the program is taking place and that an effort is being made
to address their needs. In some situations, demonstrating that
the target audience has been informed that the program is
accessible to them might be important. This will be the case
particularly with programs that rely on voluntary enrollment, such
as life-long learning programs, a venereal disease education
program, or community outreach programs for seniors, juveniles,
etc. Evaluation of these kinds of programs will require a check
on the quality of their publicity.

2. A program can impart useful information. It might be the case
that a program's most valuable outcome is the conveyance of






information to some group. Learning, of course, is the major
objective of most educational programs. Although most programs
aim toward loftier goals than just the receipt of information,
attention should not be diverted from assessing whether their less
ambitious effects occurred. In the employee training example, for
instance, it would be important to show that employees have become
more aware of the problems and life experiences of minority
clients. If you are unable to show an impact on their behavior,
you can at least show that the program has taught them something.

3. A program can actually influence changes in behavior. The most
difficult evaluation to undertake is one that looks for the
influence of a program on people's day-to-day behavior. While
behavior and attitude change are at the top of the list of many
program objectives, actually determining whether such changes have
occurred often requires more effort than the evaluator can
muster. You will, of course, be interested in at least keeping
tabs on whether the program is achieving some of its grander
goals. Consider yourself warned, however, that the probability of
a program showing a powerful behavioral effect might be minimal.

Try To Collect a Variety of Information

Three good strategies will help you do this. First, try to find
useful information which is going to be collected anyhow. Find out which
tests are given as part of the program or routinely in the setting; look at
the teachers' plans for assessment; look at records from the program or at
reports, journals, and logs which are to be kept. Check to see whether
evidence of the achievement of some of the program's objectives can be
inferred from these.

Another good way to increase the amount of information you collect is
by finding someone to collect information for you. You might persuade
teachers to establish record keeping systems that will benefit both your
evaluation and their instruction. You might hire someone such as a student
from a local high school or college to collect information. Perhaps you
can even persuade a graduate student seeking a topic for a research study
to choose one whose data collection will coincide with your evaluation.

Finally, a good way to increase the kinds of information you can
collect is to use sampling procedures. They will cut down the time you
must spend administering and interpreting any one measure. Choosing
representative sites, events, or participants on which to focus, or
randomly sampling groups for testing, will usually produce information as
credible to your audiences as if you had looked at the entire population of
people served. 
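
The random sampling mentioned above can be sketched in a few lines. This is a minimal example of simple random sampling without replacement, assuming you hold a complete roster of the people served; the roster, the sample size, and the function name are hypothetical.

```python
import random

def draw_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    Passing a seed makes the draw repeatable, so the selection
    can be documented and checked later.
    """
    rng = random.Random(seed)
    return rng.sample(list(population), sample_size)

if __name__ == "__main__":
    roster = ["participant_%03d" % i for i in range(1, 201)]
    to_test = draw_sample(roster, 30, seed=1985)
    print(len(to_test), "of", len(roster), "participants selected for testing")
```

Testing 30 people rather than 200 frees time for additional measures, which is the trade the paragraph above recommends.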

Collecting a variety of information gives you the advantage of
presenting a thorough look at the program. It also gives you a good chance
of finding indicators of significant program effects and of collecting 
evidence to corroborate some of your shakier findings. 

Besides accumulating a breadth of information about the program, you 
might decide to conduct case studies to lend your picture of the program 
greater depth and complexity. The case study evaluator, interested in the 
broad range of events and relationships which affect participants in the 
program, chooses to examine closely a particular case - that is, a school, 



a classroom, a particular group, or even an individual. This method
enables you to present the proportionate influence of the program among the
myriad other factors influencing the actions and feelings of the people
under study. Case studies will give your audience a strong notion of the
flavor of the activities which constituted the program and the way in which
these activities fit into the daily experiences of participants.

Try To Think of Clever - and Credible - Ways
To Detect Achievement of Program Objectives

Suppose in the teacher in-service example discussed earlier, it turns
out that teacher absenteeism has remained unchanged and that the number of
disciplinary referrals has diminished only slightly. These findings make
the program look ineffective.

It might be the case, however, that though teachers have continued to
send students to the office, they are discussing problems more often among
themselves, reading more about minority groups, and talking more often with
parents. Perhaps the content of referral slips has changed. Rather than
noting a student's offense by a curt remark, maybe teachers are now sending
diagnostic and suggestive information to the school office.

A little thought to the more mundane ways in which the program might 
affect participants could lead you to collect key information about program
effects. A good way to uncover nonobvious but important indicators of
program impact is to ask participants during the course of the evaluation
about changes they have seen occurring. Where an informal report uncovers
an intriguing effect, check the generality of this person's perception by
means of a quick questionnaire or a test to a sample of students. You
should, incidentally, try to keep a little money in the evaluation budget 
to finance such ad hoc data gathering. 




Collect Information To Show That The Program
At Least Has Done No Harm

In deciding what to measure, keep in mind the possible objections of 
skeptics or of the program's critics. A common objection is that the time 
spent taking part in the program might have been better spent pursuing 
another activity. Sometimes the evaluation of a program, therefore, will 
need to attend to the issue of whether students or participants, by 
spending time in the program, may have missed some other important
educational experience. This is likely to be the case with programs which
remove students from the usual learning environment to take part in special
activities. "Pull-out" programs of this kind are often directed toward
students with special needs - either enrichment or remediation. You may
need to show, for instance, that students who take part in a speech therapy
program during reading time have not suffered in their reading
achievement. Similarly, you may need to show that an accelerated junior 
high school science program has not actually prevented students from 
learning science concepts usually taught at this level. 

Related to the problem of demonstrating that students have not missed 
opportunities for learning is the requirement that you also show the
program did no actual harm. For instance, attitude programs aimed at human
relations skills or people's self-perceptions could conceivably go awry and
provoke neuroses. Where your audience is likely to express concern about
these matters, you should anticipate the concern by looking for these 
effects yourself. 






Measure What You Think Members of the Audience
Will Look For When They Receive Your Report

Try to get to know the audience who will receive your evaluation
information. Find out what they most want to know. Are they, for
instance, more concerned about the proper implementation of the program
than about its outcomes? A parent advisory group, for instance, might wish
to see an open classroom functioning in the school. They may be more
concerned with the installation of the program than with student
achievement, at least during the first year of operation. In this case,
your evaluation should pay more attention to measures of program
implementation than to outcomes, although progress reports will be
appropriate as well. If you get to know your audience, you will realize
that, for instance, Mr. Johnson on the school board always wants to know
about integration or interpersonal understanding; or the foundation that
supplied funding is mainly concerned with potential job skills. Visualize
members of the audience reading or hearing your report; try to put yourself
in their place. Think of the questions you would ask the evaluator if you
were they.

Delineating What You Can Accomplish 
Within Budget and Other Constraints 

Financial limitations and political climate represent important
constraints on an evaluation, potentially limiting the scope and depth of
its investigations. The amount of time an evaluator can devote,
limitations on who, where, and when he can measure or observe, and
constraints on what he can ask all determine the ultimate breadth and
quality of an evaluation.



The amount of time an evaluator can devote to the project is dependent 
on the available budget. Available time, in turn, significantly influences 
methodological choices. Site visits, for example, are costly in terms of 
staff time as well as travel. Special outcome measures, as another 
example, require staff time for development, pilot-testing and analysis.
Assessing more rather than fewer program participants, as a third example,
has significant cost implications. Rarely are abundant resources available
for an evaluation, and the evaluator often must juggle artfully to maintain
a reasonable balance between the demands of scientific rigor and
credibility and those of the budget. (Sometimes such a balance is just not
possible and clients need to be informed accordingly.)

But financial resources represent only a part of the constraints on 
any evaluation. Some writers have expressed pessimism about the usefulness 
of evaluation results because of the overriding social and political
motives of the people who are supposed to use evaluation results for making
decisions. Ross and Cronbach* describe the situation this way:

Far from supplying facts and figures to an economic man,
the evaluator is furnishing arms to a combatant in a war
with fluid lines of battle and transient alliances; whoever
can use the evaluators to gain an inch of terrain can be
expected to do so... The commissioning of an evaluation... is
rarely the product of the inquiring scientific spirit; more
often it is the expression of political forces.



*Ross, L. & Cronbach, L.J. "Review of the Handbook of Evaluation
Research." Educational Researcher, 1976, 5(10), 9-19.






The political situation could hamper an evaluation in several ways.
For one, it might place constraints on data collection that make accurate
description of the program impossible. The sponsor could, for instance,
restrict the choice of sites for data collection, regulate the use of
certain designs or tests, or withhold key information. Politics could, as
well, cause the evaluator's report to be ignored or his results to be
misinterpreted in support of someone's point of view.

Responding to any of these situations will depend on vigilance in each 
unique case. Remember that your major responsibility as an evaluator is to 
collect good information wherever possible.

How might an evaluator alleviate some of these political forces? 
First remember the old adage "Forewarned is forearmed" and be aware of the
political forces at work in your situation. Second, try to neutralize the
influence of competing agendas by drawing the representatives of powerful
constituencies into the evaluation process. Identify the relevant decision
makers and information users and work with them to identify the program
needs and to focus the evaluation. On this point, the Standards for
Evaluations of Educational Programs, Projects, and Materials developed by
the Joint Committee on Standards for Educational Evaluation (1981) states:

The evaluation should be planned and conducted with anticipation of
the different positions of various interest groups, so that their
cooperation may be obtained, and so that possible attempts by any of
these groups to curtail evaluation operations or to bias or misapply
the results can be averted or counteracted.






Because of the acknowledged political nature of the evaluation process 
and the political climate in which it is conducted and used, it is 
imperative that you as the evaluator examine the circumstances of every 
evaluation situation and decide whether conforming to the press of the 
political context will violate your own ethics. It could turn out that the 
data that audiences want, or the kinds of reports required, do not suit 
your own talents or standards, or the standards of the profession. 

The point is that all evaluations operate within a set of constraints 
- financial, political, and others - that influence both what an evaluation
can accomplish and its potential impact. The evaluator needs to be aware 
of these various constraints and to plan accordingly for the most effective 
evaluation possible. 






CHAPTER 2
HOW TO PLAY THE ROLE OF EVALUATOR



It is a rare evaluation that does not have both formative
and summative characteristics. As the demand for utility
in evaluation has increased over the years, information derived
from summative evaluations is often used for some aspect of
program improvement or renewal. In the ideal, a summative
evaluation is designed primarily to assess the overall impact of
a developed program so that decision-makers might determine the
ultimate fate of a program, and a formative evaluation is
conducted while a program is still being installed so that it may
be implemented as effectively as possible. In practice, both
types of evaluations pass through the same basic steps, although
there may also be differences in timing, in audiences, and in the
relationship between the evaluator and the program under
investigation.

This chapter presents a description of the many facets of an
evaluator's role and outlines the responsibilities and activities
associated with the position. Since the tasks of an evaluator
change with the context, it is inappropriate to prescribe what
this person must do. Rather the chapter will describe the role
with regard to the program and suggest some of the activities
in which an evaluator might become involved. In this Handbook
the sets of tasks which evaluators aim to accomplish are called
agendas. These are:

* Agenda A: Set the boundary of the evaluation.

* Agenda B: Select appropriate evaluation designs and
measurements.

* Agenda C: Collect information and analyze data.

* Agenda D: Report and confer with the primary audience(s).



You might think of each agenda as a set of information-gathering
activities culminating in one or more meetings
where decisions are made about the next information-gathering
cycle. Although it would be logical to perform
the tasks subsumed under each agenda in the sequence
presented above, you will likely find yourself working at
two or more agendas simultaneously or cycling back
through them again and again. This will be particularly the
case with Agendas C and D; you might collect, report, and
discuss implementation and progress data many times
during the course of the evaluation. Meetings with staff are
also likely to involve more than one agenda.



This chapter discusses in detail various ways to accomplish
each of these tasks. Chapter Three then presents step-by-step
guides for completing each of the activities outlined here.



Goals of the Formative Evaluator



Whatever their situation, formative evaluators do share a
set of common goals. Their major aim, of course, is to
ensure that the program be implemented as effectively as
possible. The formative evaluator watches over the program,
alert both for problems and for good ideas that can
be shared. The goal of bringing about modifications for the
program's improvement carries with it four subgoals:

• To determine, in company with program planners and
staff, what sorts of information about the program
will be collected and shared and what decisions will
be based on this information

• To assure that the program's goals and objectives, and
the major characteristics of its implementation, have
been well thought out and carefully recorded

• To collect data at program sites about what the
program looks like in operation and about the program's
effects on attitudes and achievement

• To report this information clearly and to help the
staff plan related program modifications






Agenda A: Set the Boundaries of the Evaluation 

Your first job will be to delineate the scope of the evaluation
by sketching out with your clients a description
of what your tasks will be. This first plan might result in a
contract describing what you will do for them as well as
the responsibility they will assume to help you gather
information, and to act upon what you report.

As soon as you have hung up your telephone, having
spoken with someone requesting an evaluation,
you are confronted with Agenda A. It could be that the
person asked you outright to help with project improvement.
Or perhaps you chose to focus on providing formative
information to the project after having been given the
job of summative evaluator or an ambiguous evaluation
role. It could be as well that you started out as one of the
program's planners and that your role as formative evaluator
is simply a "change of hats," a shift from your previous
responsibilities.






Research the Program. You should find out as much as
possible about the program before meeting with your audience,
program staff and planners in the case of a formative evaluation
and primary decision-makers in the case of a purely summative
evaluation. A recommended first activity is to contact someone
who is familiar with this or similar programs. In addition
to sharing basic information, he may be able to help you
anticipate problems with the evaluation or with developing
a good relationship with staff and planners.

By all means ask the program planners for documents
related to funding and development or adoption of the
program. These documents might include an RFP (Request
for Proposals) issued by the funding source when it first
offered money for such programs, the program plan or
proposal, and program descriptions written for other reasons
such as public relations. Use these documents to form
an initial general understanding of what the program is
supposed to look like, what its goals might be, and particularly,
what shape the evaluation might take.

In addition, it may be worth your while to quickly
check the educational literature to see what, if anything,
has recently been written about programs like the one in
question, or about specific components, such as a commercial
curriculum. You might
even find earlier evaluations of this or similar programs.






Encourage Cooperation. For whatever reason you undertake
the evaluation, the first step will be to establish a working
relationship with the clients. Since formative evaluation
depends on sharing information informally, one of the outcomes of
Agenda A in this type of evaluation should be the establishment
of groundwork for a trusting relationship with the staff and
planners. If your evaluation has been commissioned by the program
staff itself, establishing trust will be easier than if, as
in many summative evaluations, the contractor is an external
agency, perhaps the state or federal government. In the latter
case, the evaluator starts out as an outsider.

It is very important to avoid ending up in an adversary
position against a defensive program staff. This is particularly
important in a formative evaluation, and your posture toward the
staff and the program will differ somewhat from that taken in a
summative evaluation. A formative evaluator will need to convince
the staff that her primary allegiance is to help them
discover how to optimize program implementation and outcomes.
In a summative evaluation, by contrast, it is clear that your main
responsibility is to provide an unbiased report of program
accomplishments to primary decision-makers, and this
responsibility should be made quite explicit.

In order to develop the necessary trust in a formative
evaluation, you might describe the form that your outside
reporting will take and allow the staff a chance to review your
external reports. Whenever it seems necessary, you might also
guarantee that information shared for the purpose of internal
program review and improvement will be kept confidential. An
important way to gain the confidence of program personnel is to
make yourself useful from the very beginning, efficiently
collecting information they need or would like to have.

Elicit Information from Staff. While mutual trust must be
worked out gradually, some more practical aspects of your role can
be negotiated during a single meeting. Agenda A requires that
you and the principal contractor for the evaluation decide
together what you will do for them.

Again, an evaluation done for formative purposes requires
a close working relationship with the staff as you jointly
determine the most appropriate course for your efforts. If you
arrive early during program development, you may find that the
staff needs help in identifying program goals and choosing
related materials, activities, etc. Even after the program has
begun, they may still be planning. Regardless of the state of
program development when you begin, you should help the staff outline what
they consider to be the primary characteristics of the program,
highlighting those which they consider fixed and
those which they consider changeable enough to be the
focus of formative evaluation.

It is important to get a clear picture of the attitudes of program 
staff and planners, particularly concerning their commitment 
to change, that is, the extent to which they are 
willing to use the information you collect to make modifications 
in the program. Though neither you nor they will 
be able to anticipate beforehand precisely what actions will 
follow upon the information you report, you should get 
some idea of the extent to which the staff is willing to alter 
the program. 

In general, laying the groundwork for your formative 
evaluation means asking the planners and staff such questions 
as: 

• Which parts of the program do you consider its most 
distinctive characteristics, those that make it unique 
among programs of its kind? 

• Which aspects of the program do you think wield 
greatest influence in producing the attitudes or 
achievement the program is supposed to bring about? 




• What components would you like the program to 
have which it does not contain currently? Might we 
try some of these on a temporary basis? 

• Which parts of the program as it looks currently are 
most troublesome, most controversial, or most in 
need of vigilant attention? 

• On what are you most and least willing, or constrained, 
to spend additional money? Would you be 
willing or could you, for instance, purchase another 
mathematics series? Can you hire additional personnel 
or consultants? 



• Could you, for instance, remove personnel? If 
audio-visual equipment were found to be ineffective, 
would you eliminate it? Which books, materials, 
and other program components would you be willing 
to delete? Would you be willing to scrap the program 
as it currently looks and start over? 

• How much administrative or staff reorganization will 
the situation tolerate? Can you change people's roles? 
Can you add to staff, say, by bringing in volunteers? 
Can you move people, teachers or even students, from 
location to location permanently or temporarily? Can 
you reassign students to different programs or 
groups? 

• How much instructional and curricular change will 
you tolerate in the program beyond its current state? 
Would you be willing to delete, add, or alter the 
program's objectives? To what extent would you be 
willing to change books, materials, and other program 
components? Are you willing to rewrite lessons? 

The objective behind asking these questions is not to 
record a detailed description of the program. This will be 
done under Agenda B. Rather, the purpose is to uncover 
particularly malleable aspects of the program. The best way 
to find out about the staff's commitment to change is to 
ask these hard questions early. A dedicated staff that has 
worked diligently to plan the program will likely have in 
mind a point beyond which it will not go in making 
modifications. You should locate that point, and choose 
the program features you will monitor accordingly. 

Another important consideration in uncovering staff 
loyalties and attitudes is their commitment to a particular 
philosophy. If they are adopting a canned 
program, this philosophy probably motivated their choice. 
Staff members developing a program from scratch may also 
subscribe to a single motivating philosophy. However, you 
may find it poorly articulated or even unclearly evidenced 
in the program. In this case, you can create a basis for 
future decision-making by helping the staff to clarify and 
put into practice what their philosophy says. 

If you can help the staff outline areas of the program 
where modifications are likely to be either necessary or 
possible, then they can begin to delineate the parts of the 
program whose effectiveness should be scrutinized. This 
will, in turn, suggest the kinds of information they will 
need. If the program is based on canned curricula which 
will simply be installed, or on materials not expressly 
designed for the type of program in question, then what you 
can change will be restricted. In this case you should focus 
on how best to make materials or procedures fit the 
context. 



Example. A group of language arts teachers in a large high school 
decided that an alarming number of ninth-grade students were 
unable to read at a level sufficient to appreciate the literary 
content of their courses. They decided to institute a tutoring 
program in which twelfth graders would spend three forty-five-minute 
periods a week reading literary selections with ninth 
graders. The aim of the project was to improve the ninth graders' 
reading as well as to introduce them to English literature. The 
reading selections used for this program were from Pathways to 
English, a popular ninth-grade anthology; the district budget did 
not include funds to purchase reading materials for secondary 
students. 

The school's assistant principal, Al Washington, monitored 
closely the progress of the tutoring program. After only three 
months had passed, he noticed some disappointment among the 
initially enthusiastic teachers. Their informal assessments of the 
reading of tutees had convinced them that little progress had 
been made. Although the students enjoyed the tutoring experience, 
they were not learning to read. The teachers asked Mr. 
Washington to evaluate the program with an eye toward suggesting 
changes in the materials or the tutoring arrangements that 
would help the ninth graders with their reading. Mr. Washington 
carefully examined the Pathways to English text, observed tutoring 
sessions, and interviewed tutors and tutees. From this information 
he drew three conclusions about the program and offered 
suggestions for remedial action: (1) the vocabulary in Pathways 
to English was too difficult for the ninth graders, so each unit 
should be preceded by a vocabulary drill using a standard procedure 
that would be taught to tutors; (2) ninth graders were 
listening more than reading, so tutoring sessions should be 
restructured to follow a "you read to me and I read to you" 
format in which twelfth and ninth graders alternate reading 
passages; (3) the program as constituted gave ninth graders no 
feedback about their progress in either reading or literary 
appreciation; therefore, the teachers should write short unit tests in 
vocabulary, comprehension, and appreciation. 



In the case where a wholly new program is being developed, 
you will want to identify the most promising sorts of 
modifications that can be made within existing budget 
limitations. You may find it most useful to concentrate on 
helping the staff select from among several alternatives the 
most popular or effective form the program can take. 



Example. KDKC, an educational television station serving a large 
city, received a contract from the federal government to produce 
13 segments of a series about intercultural understanding 
directed at middle grade students. The objective of the series 
would be to promote appreciation of diverse cultures by depicting 
life in the home countries of the major cultural groups 
comprising the population of the United States. 

The producers of the series set out at once to assemble the 
programs based on the format of popular primary grade programs: 
the central characters living in a culturally diverse 
neighborhood converse with each other about their respective 
backgrounds. These conversations lead into vignettes, filmed and 
animated, depicting life and culture in different countries. Some 
members of the production staff, however, suggested that a 
program format suitable for the primary grades might bomb 
with older students. "How do we know," they asked, "what 
interests 10- and 11-year-olds?" They suggested two formats 
which might be more effective: a fast-action adventure spy story 
with documentary interludes and a dramatic program focusing 
on teen-age students traveling in different countries. 

To test these intriguing notions, the producer called on Dr. 
Schwartz, a professor of Child Development. Dr. Schwartz, however, 
had to admit that he was not sure what would most interest 
middle grade students either. Since the federal grant included 
funds for planning, Dr. Schwartz suggested that the producer 
assemble three pilot shows presenting basically the same knowledge 
via each of the three major formats being considered and 
then show these to students in the target age group, assessing 
what they learned and their enjoyment. The producer liked the 
idea of letting an experiment determine the form of the programs 
and agreed to allow Dr. Schwartz to conduct the studies, 
serving as a formative evaluator. 
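
A pilot comparison of this kind can be tabulated very simply. The sketch below is only an illustration, not Dr. Schwartz's procedure: the format labels, the scores, and the function names `summarize_formats` and `best_by` are all hypothetical, and it assumes each viewer yields one learning score and one enjoyment rating.

```python
from statistics import mean


def summarize_formats(results):
    """Return {format: (mean learning score, mean enjoyment rating)}."""
    summary = {}
    for fmt, rows in results.items():
        learning = [row[0] for row in rows]
        enjoyment = [row[1] for row in rows]
        summary[fmt] = (mean(learning), mean(enjoyment))
    return summary


def best_by(summary, index):
    """Name the format with the highest mean on one measure (0 = learning, 1 = enjoyment)."""
    return max(summary, key=lambda fmt: summary[fmt][index])


# Hypothetical pilot data: one (learning score, enjoyment rating) pair per student.
results = {
    "neighborhood conversations": [(6, 3), (7, 2), (5, 3)],
    "spy adventure": [(7, 5), (8, 4), (6, 5)],
    "teen travel drama": [(8, 4), (7, 4), (9, 3)],
}

summary = summarize_formats(results)
print(best_by(summary, 0))  # format with the highest mean learning score
print(best_by(summary, 1))  # format with the highest mean enjoyment rating
```

With these made-up numbers the two measures favor different formats, which is the kind of trade-off the evaluator would bring back to the producers rather than settle alone.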



Explain to the staff what you can 
and cannot do for them within the constraints of your 
abilities, time, and budget. You should let them know the 
sorts of choices you will have to make based on staff 
preferences and likely future circumstances. It is also desirable 
that you frankly discuss both your areas of greatest 
competence and those in which you lack expertise. The 
staff should know in what ways you believe you can be of 
most benefit to the program as well as how the program 
might profit from the services of a consultant who could 
handle matters outside your competence. 

Although you should have an evaluation plan in hand 
before you meet with staff and planners, let your audience 
have the opportunity to select from among several options; 
present your preferences as recommendations, and negotiate 
the general form your evaluation services will take. 
Try not to become enmeshed in details too early. You need 
only agree initially on an outline of your evaluation 
responsibilities. As the program develops, these plans could easily 
change. When describing the service you might perform, list 
the kinds of questions you will try to answer about pupil 
progress, effective use of materials, proper and 
timely implementation of activities, adequacy of administrative 
procedures, and changes in attitudes. Describe, as 
well, the supporting data you will gather to back up depictions 
of program events and outcomes. 

If you feel that the situation will accommodate the use 
of a particular evaluation design, then propose it and describe 
how it will improve the interpretability of data. In 
cases where you note controversy over the inclusion of a 
program component, or where there exists a set of 
alternatives without a persuasive reason to favor any 
one of them, suggest pilot studies based on planned variation. 
These studies, which could last just a few weeks, 
would introduce competing variations in the program at 
different sites. To help the planners eventually choose 
among them, you would check their ease of installation, 
relative effect on student achievement, and staff and 
student satisfaction. 



Example. The curriculum office of a middle-sized school district 
had purchased an individualized math concepts program for the 
primary grades. The program materials for teaching sets, counting, 
numeration, and place value consisted of worksheets and 
workbooks and sets of blocks and cards. Curriculum developers 
familiar with the literature in early childhood education were 
concerned about the adequacy of the "manipulatives" for conveying 
important basic math concepts. They wondered, as well, 
whether the materials would maintain the interest of young 
children. To find out whether supplementary materials should be 
used, the Director of Curriculum set up a pilot test. She 
purchased some Montessori counting beads and Cuisenaire rods from 
a commercial distributor and contacted a group of interested 
teachers to write supplementary lessons for using the beads and 
rods. 

When the program began in September, most of the district's 
schools used the new program without supplementary materials. 
Randomly selected schools were assigned to receive the 
teacher-made lessons based on commercial manipulatives. An in-service 
workshop was held at the end of the summer to familiarize 
teachers in the pilot schools with the commercial materials and 
locally made lessons. 

The Director of Curriculum periodically monitored the entire 
new program, administering a math-concepts test to representative 
classrooms three times during the first semester. When these 
tests were administered, she took special care to include in the 
sample the classrooms using the teacher-made lessons. She was 
therefore able to use the classes without supplementary materials 
as a control group against which to measure student achievement. 
Since development of mathematical concepts is difficult 
to measure in young children, she also planned to monitor 
teacher estimates of the suitability, ease of installation, and 
apparent effectiveness of the various program versions. 
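
The random assignment and control-group comparison in this example can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Director's actual procedure: the school names, the scores, and the functions `assign_pilot_schools` and `mean_difference` are hypothetical, and a real study would also ask whether an observed difference is larger than chance alone would produce.

```python
import random
from statistics import mean


def assign_pilot_schools(schools, n_pilot, seed=0):
    """Randomly pick n_pilot schools for the supplementary lessons; the rest are controls."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    pilot = set(rng.sample(sorted(schools), n_pilot))
    control = set(schools) - pilot
    return pilot, control


def mean_difference(scores_by_school, pilot, control):
    """Mean test score in pilot schools minus mean test score in control schools."""
    pilot_scores = [s for school in pilot for s in scores_by_school[school]]
    control_scores = [s for school in control for s in scores_by_school[school]]
    return mean(pilot_scores) - mean(control_scores)


# Hypothetical schools and math-concepts test scores, for illustration only.
scores_by_school = {
    "School A": [12, 15, 14],
    "School B": [10, 11, 13],
    "School C": [16, 14, 15],
    "School D": [9, 12, 11],
}

pilot, control = assign_pilot_schools(scores_by_school, n_pilot=2, seed=1)
print(sorted(pilot), sorted(control))
print(round(mean_difference(scores_by_school, pilot, control), 2))
```

Choosing the pilot schools by a random draw, rather than by volunteering, is what lets the remaining schools serve as a fair control group.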



Planned variation studies for a program under development 
from scratch might emphasize the relative effectiveness 
of different materials and activities. Where a previously 
designed program is being adapted to a new locale, planned 
variation studies will more likely look at variations in staffing 
and program management. 






If there is enough time, suggest a balanced set of data 
collection activities. Especially in the case of a formative 
evaluation, include a few important pilot studies, continuous 
monitoring of program implementation, and periodic checks on 
achievement and attitudes. The precise details of these plans 
can be worked out under Agenda D. If possible, a formative 
evaluation should include at least one service to the program 
that requires your frequent presence at program sites and staff 
meetings. This will help you stay abreast of what is happening 
and maintain rapport with the staff. Such extensive contact with 
staff is generally not necessary for a summative evaluation. 



Arrive at a contract 

Once you and your clients have reached an agreement 
about your role and activities, write it down. This tentative 
scope of work statement should include: 

• A description of the evaluation questions you will 
address 

• The data collection you have planned, including 
sources, sites, and instruments 

• A timeline for these activities, such as the one in 
Table 2 

• A schedule of reports and meetings, including tentative 
agendas where possible 

Be certain to stress the tentative nature of this outline, 
allowing for changes in the program and in the needs of the 
staff. Also, remember you will be responsible for all evaluation 
activities contracted. Exercise your option to accept or 
reject assignments. 

The linking agent role in formative evaluation 

If you have expertise in or access to information about the 
subject areas the program addresses or if you know about 
programs of its type in operation elsewhere, you might like 
to append to your formative role an additional title, much 
in vogue: Linking Agent. A linking agent connects important 
accumulated information and resources with interested 
parties, in this case, the planners and staff of the program 
you are evaluating. The linking agent is a one-person 
information retrieval system. Her sources are libraries, journals, 
books, technical reports, and experts and service agencies of 
all kinds. 






Different linking services will be relevant to different 
programs. For example, you might locate and describe for 
the staff sets of recently developed curriculum materials 
related to a locally developed program. If you were evaluating 
a special education program, you might find and make 
use of a regional resource center offering consultation and 
diagnostic help with special education students. The role of 
linking agent will simply broaden the range of program 
improvement information you collect. Be careful, however, 
that linking does not interfere with your primary job: to 
monitor and describe the program at hand. 

Agenda B: Select Appropriate Evaluation Designs and Measurements 

In Agenda A, you committed yourself to evaluation 
activities. In this agenda, the program's decision-makers and 
staff provide a working description of the program. In a 
summative evaluation, you will collect a statement of the program's 
goals and objectives, a description of how the program 
components have been implemented, and a summary of the costs of 
the program in order to decide which outcomes, activities, and 
costs to measure. The kit book How to Design a Program 
Evaluation will give you careful guidance about selecting the 
appropriate design for the evaluation. A design is a plan of 
which groups will take part in the evaluation and when 
measurements will be made on these groups. Your design 
might include a wide variety of measures such as achievement 
assessments, attitude scales, narrative descriptions of 
observations, and cost analyses. All measures should be 
carefully selected to give information about particular outcomes. 

Prepare a Program Statement. If a program is still under 
development, then it may be your task to have the program's 
planners and staff commit themselves to a working description of 
their program. The final product of this activity should be a 
written list of program objectives, and a rationale that describes 
the relationship between these objectives and the activities that are 
supposed to produce them. The program statement should 
reflect the current consensus about what comprises the 
program, arrived at with the understanding that the program's 
character may alter over time. Although the range of 
activities in Agenda B is relatively restricted, you will find 
that the ease of the task will depend heavily on your 
interpersonal skills. 

TABLE 2 
A Summary of the 
Formative Evaluator's Responsibilities 

(Original column headings: Number of personnel work hours required, 
by role; Tasks/Activities; time in months, J A S O N D J F M A M; 
Completion Date; Reports and Deliverables. The role headings other 
than Program Director and Principals, the month-by-month time bars, 
and the TOTAL PERSON HOURS row are illegible in this copy, so work 
hours are given in the order listed.) 

Tasks/Activities                   Completion  Reports and              Work hours 
                                   Date        Deliverables             (as listed) 

Review/revision of program plan    July 31     Revised written plan     37, 8, 6, 16 

Discussion about method of         Sept 15     None                     16, 7, 6 
formative feedback alternatives 

Planning of implementation-        Sept 30     List of instruments;     60, 10, 24, 2 
monitoring activities                          schedules of classroom 
                                               visits 

Construction of implementation     Oct 10      Completed instruments    60, 5, 12, 16 
instruments 

Planning of unit tests             Oct 10      List and schedule of     30, 5 
                                               achievement tests 

First meeting with staff           Nov 1       None                     9, 15, 24, 20 

First meeting with district        Nov 8       First interim report     22, 20 
administration 

Writing a preliminary program statement, even if only in 
outline form, is useful because it demands careful thought 
by the program's staff and planners about what they intend 
the program to look like and do. This thinking alone can 
lead to program improvements. Most successful programs 
are built upon a structured plan that has been clearly 
thought out and that describes as precisely as possible the 
program's activities, materials, and administrative arrangements. 
A clear program statement encourages program success 
for several reasons. For one thing, everybody involved 
knows where such a program is headed and what its critical 
characteristics ought to be. Everybody is working from the 
same plan. If program variations are taking place, then the 
staff and planners are likely to be aware of this. The 
evaluator can document such differences and, where possible, 
assess their merits. Fear of disputes among staff 
members, advisory committees, and teachers should not 
dissuade you from attempting to clarify program procedures 
and goals. Disagreements during the planning of a 
program are by and large healthy, especially when a program 
is in its formative stage, and the staff should be willing 
to adopt a "wait and see" attitude. Not all differences of 
opinion may be resolved, but the pooling of staff intelligence 
through discussion should be preferred to leaving 
each teacher to make his own guesses about what will work 
best. 



Make sure that goals are well stated 

There are three basic sources of information about program goals: 

• The program plan, proposal, and other official documents 

• Structured interviews and informal dialogues with 
program staff 

• Naturalistic observation-based intuitions about program 
emphases 

As has been mentioned, it is possible that you will arrive 
on the scene and find that the program has been too 
vaguely planned. Formative evaluation presumes the legitimacy 
of evaluating programs whose content and processes 
are still developing. Frequently the staff will be unable to 
tell you exactly what the program should look like, and 
objectives may be too general to serve as a basis for monitoring 
pupil progress. Although you should get a glimpse of 
how the program will function from documents such as the 
program proposal or the program plan, often these consist 
of exhaustive lists of documented needs that the program 
should meet, a page or two on objectives, and a 
description of the program's staffing and budget. A description 
of what people taking part in the program do or have 
done to them is not to be found. 

Official documents represent formal statements of program 
intentions. These may be outdated, incomplete, erroneous, 
or unrealistic. Written descriptions of categorically 
funded programs such as ESEA Title I are particularly 
misleading; their objectives often reflect only politically 
minded rhetoric. Canned programs, or sets of published 
program materials, are another source of official objectives. 
But be careful here as well. While adoption of a particular 
program may reflect a philosophy shared between program 
staff and the developer of the materials, it is also possible 
that the staff running this particular program consciously or 
unconsciously possesses a different set of goals or that the 
program will only use certain components of the purchased 
materials. 

Because of the problems associated with goals listed in 
official documents, you are responsible for obtaining goal 
information from discussions which probe the motives of 
the program staff and from observations of the program. 
Simply asking staff members their perceptions of program 
objectives will often elicit a recitation of documented goals, 
cliches, or socially desirable answers. Asking staff for scenarios 
of what you might see or expect to see at program 
sites is sometimes more productive. These scenarios can be 
followed by questions about the particular learning that is 
expected to result from the activities described. You may 
also find it easy to elicit statements from staff members 
about which aspects of the program are free to vary and 
which are not; this information too can shed light on the 
program's aims and rationale. 

Record the program's rationale 

Careful examination of the rationale underlying the program 
goes hand in hand with efforts to base the program on 
a clear and consistent plan. The rationale on which any 
program is based, sometimes called a process model, is 
simply a statement of why this program, a particular set of 
implemented materials, activities, and administrative 
arrangements, is expected to produce the desired outcomes. 
Sometimes the relationship between methods and goals is 
transparent, but other times, particularly with innovative 
programs, the credibility of the program requires that the 
staff explain and justify program methods and materials. 



Example. A team of teachers from four high schools in a large 
metropolitan area planned a work-study program. The purpose 
of the program was to teach careering savvy. The teachers 
defined this as "knowledge about what it takes to be successful 
in one's chosen field of endeavor." The district assigned a consultant 
to the project, Anna Smith, whose job it was to help 
teachers iron out administrative details involved with coordinating 
student placement. Ms. Smith had also been told to serve 
in whatever formative evaluation capacity seemed necessary. 

Having discovered that the teachers did not write a proposal 
for the program, she asked that they meet with her so that she 
could write a short document describing the program's major 
goals and outlining at least the skeleton of the program. At this 
meeting the teachers described the basic program. Students 
would choose from among a set of community-wide jobs made 
available at minimum pay by various professional and business 
firms. The students would work as office clerks, sales persons, 
receptionists; they might be called on to make deliveries and do 
odd jobs. 

Instantly Ms. Smith saw that the program was without a clear 
rationale. "What makes you think," she asked, "that students 
will gain an understanding of the important skills involved in 
carrying on a career as a result of their taking on menial jobs?" 

The staff had to admit that the program as planned did not 
guarantee that students would learn about the duties of people 
in different careers or about prerequisite skills for success. 
Together with Ms. Smith they restructured the program as follows: 

• They added an observation-and-conversation component to 
ensure that sponsoring professionals and business people 
would commit some time to describing their personal career 
histories and would allow the students to observe the course 
of their work day. 

• Students would be required to keep journals and read about 
the career of interest. 



The formative evaluator should see to it that the program 
rationale is suited to the conditions under which the 
program will be carried out. A mismatch may arise, for 
example, from staff insensitivity to time needed for the 
program to produce its effects. This could be reflected in 
too many objectives or objectives that are too ambitious for 
a project of moderate duration. 

Lack of a clear program statement does not necessarily 
mean that goals, a program rationale, and plans for activities 
do not exist.³ Producing a program statement is most often 
a matter of helping the staff and planners to coordinate 
their intentions and shape their ideas. 

Production of the written statement provides a good 
opportunity for planners to describe concretely the program 
they envision. Because you have the job of writing an 
official statement for the staff, you will be able to ask 
difficult questions without implying any criticism. In 
describing goals and activities, and especially in exploring the 
logic of the connection between them, the program staff 
may encounter contradictions, uncertainties, and conflicts 
which you will have to handle with tact, patience, and 
persistence. Their sense of ease with you in your evaluator role 
will be reflected in the degree of candor with which they 
participate in these discussions. Interviews with program 
staff and first-hand observations might need to replace 
group discussions as your primary source of information if 
staff members find it too hard to articulate goals, strategies, 
and rationale in group settings. 

Work on Agenda B can proceed concurrently with work 
on Agenda A. Since you will be meeting with the program 
staff to reach agreement on your relative roles, you might 
also use these meetings to clarify program goals, describe 
implementation plans, and work out the rationale. 

The staff, finally, should keep in mind that the program 
statement you produce is a working document. You, and 
they, will update it periodically, perhaps at the end of each 
reporting interval you have agreed upon. In the meantime, 
the existing document will be useful to people interested in 
the program. Besides guiding both the program as implemented 
and the evaluation, it can serve as the basis of 
reports and public relations documents. 



3. If, in fact, there is a total lack of consensus concerning what 
the program is about, you may find it necessary to do a retrospective 
needs assessment with the staff. Needs assessments result in lists 
of prioritized goals, determined by polling the wants of the educational 
constituency and determining how well these wants are being 
filled by the current program. Needs assessment is discussed in 
greater detail on page 8 






Agenda C: Monitor Program Implementation and 
the Achievement of Program Objectives 

One of the distinctive features of formative evaluation is 
the continuous description and monitoring of the program 
as it develops, including measurement of the impact it is 
having on the attitudes and achievement of its target 
groups. The first two agendas focus on tentative agreements 
about the program's scope, rationale, planned activities, and 
goals. With Agenda C, you begin to investigate the match 
between the paper program, filled with intentions and 
plans, and the program in operation. The information you 
gather about the program for Agenda C can be used to: 

• Pinpoint areas of program strength and weakness 

• Refine and revise the program statement and, possibly, 
your evaluation plan 

• Hypothesize about cause-effect relationships between 
program features and outcomes 

• Draw conclusions about the relative effectiveness of 
program components where you have been able to use 
good evaluation design and credible measures 

Agenda C is the phase of formative evaluation with the 
strongest research flavor. It can involve selecting samples; 
developing, trying out, selecting, administering, and scoring 
instruments; and analyzing and interpreting data. The component 
books of the Program Evaluation Kit will be most 
useful for this phase of the evaluation. In addition, if you 
will be conducting pilot studies, or if your evaluation can 
use a randomized control group, then you can refer to 
Chapter 5, the Step-by-Step Guide for Conducting a Small 
Experiment, for more precise guidance. 

In carrying out Agenda C, as with the others, you and 
the audience share information with a view toward producing 
a product. In this case, the product is an analysis, 
sometimes summarized in an interim report, of the program's 
implementation and its progress toward achieving its 
objectives. You tell the program staff at this point about 
the specifics of your sampling plan and site selection, and 
the measures you have chosen to purchase or construct in 
order to study features of program implementation or to 
monitor the attitudes or achievement of different subgroups. 
You might, in addition, describe the pilot tests or 
case studies you have chosen to pursue. 

The first task of the program staff during Agenda C is to 
respond to your data gathering plan, suggesting adjustments 
to focus it more closely on what they most want to know. 
They should also confer with you in order to ensure that 
your measurements will not be too intrusive on program 
activities or personnel. Finally, they should share with you 
their perceptions of the credibility of the information you 
propose to collect. 

Once you have reported results to the staff and planners 
from one round of data collection, the audience's job will 
be, quite naturally, to carefully examine what you have said 
and choose a course of action. 

The degree to which your own personal opinions should 
guide your data collection and reporting is, incidentally, 
something to be negotiated with staff and planners. You 
could, on the one hand, take the stance of an impartial 
conduit for transporting the information the staff feels it 
needs. At another extreme, you could be highly opinionated, 
calling staff attention to what you feel are the program's 
most critical and problematic processes and outcomes. 
In the former situation, your report will convey to 
program planners the data you collected with the postscript, 
"Now you make the decision." If you plan to 
express opinions, then your reports will likely advocate a 
course of action, and the data collected will be planned 
with an eye toward providing evidence to support your 
case. 


Formative data collection plan

Ideally it would be nice if the formative evaluator could
remain on-site with the program for extended periods of
time, in the style of the participant-observer. Realistically,
however, it is likely that budget, time, and possibly the
geographical distribution of program sites will make such
vigilance impossible. You will have to rely on sampling,
good rapport with the staff, and a well-designed
measurement plan to give you an accurate picture of the program
and its effects.

Your major source of first-hand information about the
program will be your own informal observations and
conversations with staff members while on site. Their
descriptions of the program and explanations of what you see
occurring should give you a good idea of how to design
more formal data gathering instruments. Informal
observations should also show where the program is going well and
where it is failing, where a program component has been
efficiently carried out, where it is partially implemented,
and where it is not taking place.

In order to ensure that your informal impressions are
representative and accurate, more formal data gathering will
be necessary. For the purpose of formative evaluation,
three approaches to collecting data about the program seem
most useful:

• Periodic program monitoring

• Unit testing

• Pilot and feasibility studies

Your choice will be primarily determined by what you
want to know.

Periodic program monitoring. The formative evaluator
who wishes to check for proper program implementation
throughout the evaluation selects a target set of
characteristics which he then monitors periodically and at various
sites. He also may select or construct achievement tests and
attitude instruments to assess at these times the attainment
of objectives of interest to the staff. The sites supplying
formative information and the times at which this
information is collected are often based on a sampling plan to
ensure that the measurements made at intervals reflect the
program as a whole.
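
A sampling plan of this kind can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: the site names, the monitoring occasions, and the function name monitoring_schedule are all hypothetical, and a simple random sample of sites is drawn independently for each occasion. A real plan might instead stratify sites or revisit the same ones.

```python
import random

def monitoring_schedule(sites, occasions, sites_per_visit, seed=0):
    """Draw an independent simple random sample of sites for each occasion."""
    rng = random.Random(seed)  # fixed seed so the plan can be reproduced
    return {
        occasion: sorted(rng.sample(sites, sites_per_visit))
        for occasion in occasions
    }

# Hypothetical site names and monitoring occasions, for illustration only.
sites = ["Site A", "Site B", "Site C", "Site D", "Site E", "Site F"]
occasions = ["Week 6", "Week 12", "Week 18", "Week 24"]

schedule = monitoring_schedule(sites, occasions, sites_per_visit=3)
for occasion, chosen in schedule.items():
    print(occasion, chosen)
```

Fixing the seed means the same schedule can be regenerated later and shown to staff and planners when the plan is negotiated.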






Example. Leonard Pierson, assistant to a district's Director of
Research and Evaluation, was asked to serve as formative
evaluator during the first year of a parent education program. The
purpose of the program was to train parents of preschool
children to tutor them at home in skills related to reading readiness:
classification of objects, concept formation, basic math and
counting, conversation and vocabulary. Federal funds had been
provided for the training and to purchase home workbooks
which were supplied free of charge. These workbooks sequenced
and structured the home tutoring. They contained lessons,
suggestions for enrichment activities, and short periodic assessment
tests. The parent training centers were set up at six community
agencies and schools throughout the district. Local teachers
conducted evening classes to teach parents to use the workbooks
daily with their children at home.

When the project director contacted Mr. Pierson, he was
simply asked to give whatever formative evaluation help he
could. Mr. Pierson, free to define his own role, decided to focus
on four questions:

• Most importantly, do students learn the skills that are
emphasized by the workbook?

• To what extent do parents actually work with their child
daily?

• Do parents use techniques taught them in the training course,
or do they develop their own?

• Are their own techniques more or less effective than those
they have been trained to use?

In order to help answer these questions, Mr. Pierson designed
two instruments and a monitoring system for administering them
periodically.

• A general achievement test consisting of items sampled from
the progress tests in the workbooks. The test will be
administered every six weeks to a sample of participants' children.
Presumably scores on this test should increase over time. A
control group will also take the test every six weeks to
account for learning due to sheer maturation.

• An observation instrument to be completed by the
community member who visits the home to give the six-weekly
achievement tests. The instrument records the amount of
progress made in the workbooks since the last visit, the
nature of the teaching methods used by the parent, and the
apparent appropriateness of the current lesson to the
student's skills, that is, whether it seems to be too difficult.
Observers will be trained to be particularly alert to changes in
teaching style, recording both deviations and innovations.



The details of a periodic monitoring plan are usually
agreed to by the evaluator and planners at the beginning of
the formative evaluation and then vary little throughout the
evaluator's collaboration with the program. The periodic
program monitor submits interim reports at the conclusion
of each data gathering phase. Like Table 3, these often
focus on whether the program is on schedule.

A formative evaluator can use Table 3 to report to the
program director and the staff at each location the results
of monthly site visits. Each interim report could include an
updated table accompanied by explanations of why ratings
of "U," unsatisfactory implementation, have been assigned.
The occasions on which measurements are made are
determined by the passage of standard intervals (a month, a
semester) or by logical transition periods in the program,
such as the dates of completion of critical units. The
evaluator might check time and again at the same sites or
with the same people, or he could select a different
representative sample to provide data at each occasion.



TABLE 3

Project Monitoring: Activities

Winona School District, Miley School

Objective 6: By February 2[?], 19YY, each participating
school will implement, evaluate results, and make
revisions in a program for the establishment of a
positive climate for learning.

Activities for this objective                  Oct   Dec   Jan   Feb

6.1 Identify staff to participate
6.2 Selected staff members review
    ideas, goals, and objectives
6.3 Identify student needs
6.4 Identify parent needs
6.5 Identify staff needs
6.6 Evaluate data collected in
    6.3-6.5
6.7 Identify and prioritize specific
    outcome goals and objectives
6.8 Identify existing policies,
    procedures, and laws dealing with
    positive school climate

Evaluator's Periodic Progress Rating:

I = Activity Initiated      P = Satisfactory Progress
C = Activity Completed      U = Unsatisfactory Progress



Where the same measures are used repeatedly at the
same sites, periodic monitoring resembles a time-series
research design. This permits the evaluator to form a
defensible interpretation of the program's role in bringing about
the changes recorded in achievement and attitudes. Using a
control group whose progress is also monitored further
helps the evaluator to estimate how program students
would be performing if there were no program.
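
One simple way to read such repeated measurements is to compare the gain of the program group with the gain of the control group over the same occasions. The Python sketch below assumes made-up test scores and hypothetical function names. It shows only the arithmetic of the comparison, not a full time-series analysis.

```python
from statistics import mean

def mean_by_occasion(scores):
    """Average the scores collected at each testing occasion."""
    return [mean(occasion_scores) for occasion_scores in scores]

def gain_difference(program_scores, control_scores):
    """Program gain minus control gain, from the first to the last occasion."""
    program = mean_by_occasion(program_scores)
    control = mean_by_occasion(control_scores)
    return (program[-1] - program[0]) - (control[-1] - control[0])

# Made-up scores: one inner list per six-week testing occasion.
program_scores = [[10, 12, 11], [14, 15, 13], [18, 17, 19]]
control_scores = [[10, 11, 12], [12, 12, 12], [13, 14, 12]]

difference = gain_difference(program_scores, control_scores)
print(difference)
```

With these illustrative numbers the program group gains 7 points and the control group 2, so the difference of 5 is the part of the gain not accounted for by maturation alone, provided the two groups are comparable.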

Unit testing. An evaluator can focus on individual units
of instruction or segments of the program that the staff has
identified as particularly critical or problematic. In this
case, monitoring of implementation will require in-depth
scrutiny of the particular program component under study.
Because the evaluator's task is to determine the value of
specific program components, the implementation of these
components will need to be described in as detailed a way
as possible. Achievement tests, attitude instruments, and
other outcome measures will have to be sensitive to the
objectives that units of interest address. This could make it
necessary for the evaluator to tailor-make a test, since
general attitude and achievement tests will be unlikely to
address the particular outcomes of interest. In some cases,
the curriculum's own end-of-unit tests can be administered.
If you use curriculum embedded tests, however, be careful
that they are not so filled with the program's own format
and content idiosyncrasies that they sacrifice
generalizability to other contexts or make the control group's
performance look misleadingly bad. The occasions on which
measurements are made for unit testing are determined by
when important units occur during the course of the
program. Sampling of sites and participants should be done
where it is inadvisable or impractical to measure all
students, but representative subgroups of students or
classrooms can be measured or observed.



4. This table has been adapted from a formative monitoring
procedure developed by Marvin C. Alkin.






Reports about the effectiveness of these critical program
events should be delivered in time for modifications to be
made in similar units, or in the same units in preparation
for the next time a group of program participants
encounters them.

If the teaching of units can be staggered at different sites
so that not all students are being taught the same unit at
the same time, then the results of unit tests can be used to
make decisions about the best way to teach that unit at
sites where it has not yet been introduced. Using a control
group for unit testing gives quicker information about the
relative effectiveness of different ways to implement the
unit in question; more than one version can be tried out at
the same time and their relative effectiveness assessed. Unit
testing with a control group amounts to the same thing as a
pilot or feasibility study.

Pilot and feasibility studies. These are usually
undertaken because members of the program staff or the planners
have in mind a particular set of issues that they need to
settle or a hard decision to make. Pilot and feasibility
studies are carefully conducted and usually experimental
efforts to judge the relative quality of two or more ways to
implement a particular program component. Pilot studies
could be undertaken, for instance, to determine the most
effective order in which to present information in a science
discovery lab or the most beneficial time to switch students
in a bilingual program to an all-English reading group. These
studies require that different competing versions of a
program component be installed at various sites. The evaluator
first checks the degree to which each site carried out the
program variation it was assigned; then, after giving the
variation time to produce results, he tests for their
relative effectiveness. Like unit testing, feasibility studies
demand measurement instruments that are sensitive to the
outcomes that the program versions aim to produce. They
usually demand random sampling since they use statistical
tests to look for significant differences in the performance
of groups experiencing different program variations. Pilot
tests generally take place either before the program has
begun, or ad hoc throughout the course of the evaluation
whenever controversy or lack of information creates a need
to try variations of the program.
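
As one illustration of such a statistical test, the Python sketch below runs a two-sided permutation test on the difference between two group means. The choice of a permutation test, the scores, and the names permutation_test, version_x, and version_y are all assumptions made for this example. The text does not prescribe a particular test, and the How To Calculate Statistics book in the Kit covers the standard alternatives.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Reassign scores to the two versions at random.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Made-up end-of-unit scores from sites assigned to two program versions.
version_x = [72, 85, 78, 90, 81, 77]
version_y = [65, 70, 74, 68, 72, 66]

p_value = permutation_test(version_x, version_y)
print(p_value)
```

A small p-value suggests that a difference this large would rarely arise from random assignment alone. The conclusion is only as sound as the random assignment of sites or students to versions.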



Example 1. Dr. Schwartz, the university professor working as a
formative evaluator for educational television station KDKC,
overheard a conversation one day between two writers working
on an episode for a series on cultural awareness. "'Poverty and
Potatoes' is a silly name for an episode on Ireland," one writer
was saying. "Well, maybe you can do better; but I say we need
catchy titles, things that will make the kids want to watch," the
other writer retorted. Dr. Schwartz offered to help the writers
find such a title. He suggested that for each episode of the
program they write four or five possible titles. He would then
construct and administer a questionnaire for students in order to
find out from the target audience which title would most entice
them to watch a television program.

Example 2. The Stone City school board voted in January to
dismantle special classrooms for the educationally handicapped
and return EH students to the regular classroom beginning in
September. EH students would spend most of their time in the
regular classroom, but would be "pulled out" each day to work
with a special education teacher at a resource room in the
school. The change would mean not only shifting students and
altering the job roles of special education teachers; it would
demand establishing resource rooms in schools which did not
previously have them.

Faced with this large change of delivery of special education
services, the district Director of Special Education suggested
that some pilot work done during the present school year could
prevent mistakes when mainstreaming took effect in the
whole district in September. With board approval, she decided to
phase mainstreaming into eight of the district's schools during
the spring:

Phase I. Two schools which already had resource rooms
would move EH students into the regular classrooms in March.
The Director of Special Education would carefully observe and
informally interview teachers, students, and special education
teachers at these two schools to identify major problems
involved in the transition. She would then work out an
instructional package for teachers and parents that could be used to
alleviate some of the problems and misunderstandings that could
coincide with the organizational change.

Phase II. In April, three additional schools would be
mainstreamed, using the training and counseling package developed
during the first phase. Again, the effectiveness and smoothness
of the transition to mainstreaming would be assessed, based on
observations and interviews with regular teachers, special
education teachers, students, and parents. The April sample would
include one school which had not used resource rooms in the
past. This would give the director a notion of the effects of
mainstreaming in situations where it represents an even larger
departure from regular practice. The training package would be
revised based on feedback from teachers and parents with whom
it was used.

Phase III. In May, three schools which had not previously had
resource rooms would be converted to mainstreaming. The
experience of the first two phases hopefully would make this
transition a smooth one.

The Director of Special Education, having experienced
several months of work in mainstreaming, could spend the summer
preparing materials for parents and training teachers to
anticipate September's reorganization.



Pilot and feasibility tests usually occur only when the
evaluator offers to do them. Planners do not usually ask to
have this sort of service performed for them. A feasibility
study need not, as well, be based on achievement outcomes.
A common question it might address is "Will people like
Version X better than Version Y?"

Whatever plan you use for monitoring the program's
effects, your efforts will allow the staff to make data-based
judgments about whether program procedures are having an
effect on participants. Besides the outcomes that planners
hope to produce, you will need to look vigilantly for
unintended outcomes: side effects that can be ascribed to
the program but which have not been mentioned by
planners or listed in official documents. Although side effects
are generally thought of as negative, they could as easily be
beneficial. You might discover, for example, potentially
effective practices spontaneously implemented at a few
sites that are worth exporting to others. Negative
unintended effects are important to discover if the program is to
be improved. They highlight areas that require added
attention, modification, or even discarding.

Remember that when the data you collect suggest
revisions in the program, you will have to amend the program
statement as well. Program staff should take part in making
these revisions, and consensus should be reached before any
changes are recorded in the program's official description.
New program statements may also suggest revisions or
additions to your contracted evaluation activities.

TABLE 4

Contrasts Between Reports for Formative and Summative Evaluation

Purpose

  Formative report: Shows the results of monitoring the
  program's implementation or of pilot tests conducted during
  the course of the program's installation. Intended to help
  change a part of the program that is not working as well as
  it might, or to expand a practice or special activity that
  shows promise.

  Summative report: Documents the program's implementation
  either at the conclusion of a developmental period or when
  it has had sufficient time to undergo refinement and work
  smoothly. Intended to put the program on record, to
  describe it as a finished work.

Tone

  Formative report: Informal.

  Summative report: Usually formal.

Form

  Formative report: Can be written or audiovisual; can be
  delivered to a group as a speech, or take the form of
  informal conversations with the project director or staff,
  etc.

  Summative report: Nearly always written, although some
  formal, verbal presentation might be made to supplement or
  explain the report's conclusions.

Length

  Formative report: Variable.

  Summative report: Variable, but sufficiently condensed or
  summarized that it can be used to help planners or decision
  makers who have little time to spend reading at a highly
  detailed level.

Level of specificity

  Formative report: High, focusing on particular activities
  or materials used by particular people, or on what happened
  with particular students and at a certain place or point in
  time.

  Summative report: Usually more moderate, attempting to
  document general program characteristics common to many
  sites so that summary statements and general, overall
  decisions can be made.

Agenda D: Report and Confer With Planners and Staff

The reporting mode for formative evaluation varies with the
situation. As is shown in Table 4, formative reports almost
never look like the more technical ones submitted by
summative evaluators. Most formative reporting takes place in
conversations or discussions that the evaluator has with
individuals or groups of program personnel. The form of
your report will depend on:

• The reporting style that is most comfortable to you
and the staff with whom you are working

• The extent to which official records are required

• Whether you will disseminate results only among
program sites, or to interested outsiders and the general
community as well

• How soon the information must reach its audience in
order to be useful

• How the information will be used

Whether reports are oral or written is up to you. If
additional planning or program modification will be based
on the reports you give, then it is best to discuss program
effects with the staff, perhaps at a problem solving meeting,
so that remedies for problems can be debated and decisions
made.

A written report provides a documentation of activities
and findings to which the audience can continually refer
and that can be used in program planning and revision.
Written reports, however, take time to draft, polish, discuss,
and revise. This is time that might be better spent collecting
information and working on program development with the
staff. In many cases, the best way to leave a written trace of
the results of your formative findings will be to periodically
revise the program statement you produced as part of
Agenda B.

Face-to-face meetings provide the staff and planners
with a forum for discussion, clarification, and detailed
elaboration of the evaluation's findings as well as the
opportunity for making suggestions about upcoming evaluation
activities. During conversational reports, you will be able to
make requests for assistance in solving logistical problems
or collecting data. Staff members might also want to
express their problems or suggest new information needs.

A schedule for interim reports should be part of the
evaluation contract. The program staff should indicate
when or how frequently they wish to review the results of
each evaluation activity. Interim reports on the progress of
program development should contain results of completed
evaluation components, a reiteration of tasks yet to be
accomplished, and a full description and rationale for any
changes in your responsibilities that may have to be
negotiated.

Formative evaluation reports can include feedback of
different sorts. At minimum, such a report will simply
describe what the formative evaluator saw taking place:
what the program looked like and what achievement or
attitudes appeared to be the result. Depending upon his
presumed expertise in such matters, the formative evaluator
may also make suggestions about changes, point to places
where the program is in particular need, and offer services
to help remedy these problems. Your contributions along
these lines will depend on your expertise and the contract
you have worked out with the planners and staff.

If your evaluation service has focused on pilot or feasi- 
bility studies, then your report will follow a more standard 
outline, although you may supplement the discussion of the 
results with recommendations for adaptation, adoption, or 
rejection of certain program components and perhaps out- 
line further studies that are needed. 

The tentative nature of instructional components in the
formative stages of a program should be a recurring theme
in your conversations with and reports to the staff. You
will find that once the staff are comfortable with
program procedures, they will want to avoid making further
changes in the program. The formative evaluator will have
to make a conscious effort to keep the staff interested in
looking at program materials and procedures with a view
toward making them yet more appropriate, effective, and
appealing for the students. Although evaluators will have
the responsibility of overseeing the collection of
information to support decisions about program revisions, the
suggestions and active involvement of teachers in this
decision-making process are crucial. Everyone on the program staff
should understand why the formative evaluation is
occurring and should be encouraged to take part.



For Further Reading 

Alkin, M. C., Daillak, R., & White, P. Using evaluations:
Does evaluation make a difference? Sage Library of
Social Research. Beverly Hills, CA: Sage Publications, 1979.

Baker, E. L., & Saloutos, A. G. Evaluating instructional
programs. Los Angeles, CA: Center for the Study of
Evaluation, 1974.

Havelock, R. G. Planning for innovation through
dissemination and utilization of knowledge. Center for Research
on Utilization of Scientific Knowledge, Institute for
Social Research, University of Michigan, January, 1971.

Lichfield, N., Kettle, P., & Whitbread, M. Evaluation in the
planning process. Oxford: Pergamon Press, 1975.

Nash, N., & Culbertson, J. (Eds.). Linking processes in
educational improvement. Columbus, OH: University
Council for Educational Administration, 1977.

Patton, M. Q. Utilization-focused evaluation. Beverly Hills,
CA: Sage Publications, 1978.







Step-by-Step Guides for Conducting Evaluation

Chapter Two lists some of the myriad jobs of an evaluator.
Keep in mind the general differences between the roles of a
formative evaluator and a summative evaluator. The goal of a
formative evaluator is to collect and share with planners and staff
information that will lead to improvement in a developing program.
A summative evaluator has the responsibility for producing an
accurate description of the program, complete with measures of
its effects, that summarizes what has transpired during a
particular time period. Results from a summative evaluation,
usually compiled into a written report, can be used for
several purposes:

• To document for the funding agency that services promised
by the program's planners have indeed been delivered

• To assure that a lasting record of the program remains on
file

• To serve as a planning document for people who want
to duplicate the program or adapt it to another setting

The sometimes idiosyncratic nature of evaluations may make a
step-by-step guide seem unnecessary, and in truth, there is no
step-by-step way to perform tasks involved with the role.

Enough activities are common among evaluations, however, to
permit a general outline of what needs to be accomplished.
Chapter Two describes four agendas to which an evaluator must
attend to some degree. These agendas are:

• Agenda A: Set the Boundary of the Evaluation. That is,
negotiate the scope of the data gathering activities in which you
will engage, the aspects of the program on which you will
concentrate, and the responsibilities of your audience to
cooperate in the collection of data and to use the information
you supply.

• Agenda B: Select Appropriate Evaluation Design and
Measurements. In this case, you determine specifically what is
to be measured, when, and how. The analysis to be employed is
also planned at this stage.

• Agenda C: Data Collection and Analysis. This includes
administering all the planned data collection instruments to
appropriate groups, compiling and analyzing the data, and
selecting an appropriate reporting form.

• Agenda D: Final Report. Given the original purpose of the
evaluation, plan and execute a reporting strategy for the
appropriate audiences.

As Chapter Two mentioned, many of the tasks falling within
the scope of the different agendas will actually occur
simultaneously or in an order other than that described by the
guide. In general, however, there will be some logical order to
how the evaluation unfolds. Consider the guide as a loose map of
the activities you might perform.








If you are working as a formative evaluator for the first
time in the setting, your best guidance might come from a
conversation with someone who has evaluated the program
before or who has served as formative evaluator in a similar
setting. If formative evaluation presents a change in the
evaluation role to which you are accustomed, then seek out
someone who has done it before. Nothing beats advice from
long experience.

Whenever possible, the step-by-step guides use checklists
and worksheets to help you keep track of what you have
decided and found out. Actually, the worksheets might be
better called "guidesheets," since you will have to copy
many of them onto your own paper rather than use the one
in the book. Space simply does not permit the book to
provide places to list large quantities of data.

As you use the guides, you will come upon references
marked by a special symbol. These direct you to read
sections of various How To books contained in the Program
Evaluation Kit. At these junctures in the evaluation, it will
be necessary for you to review a concept or follow a
procedure outlined in one of these seven resource books:

• How To Deal With Goals and Objectives

• How To Design a Program Evaluation

• How To Measure Program Implementation

• How To Measure Attitudes

• How To Measure Achievement

• How To Calculate Statistics

• How To Present an Evaluation Report







Set the Boundaries of the Evaluation

Instructions



Agenda A encompasses the evaluation planning
period, from the time you accept the job of
evaluator until you begin to actually carry
out the assignments dictated by the role. Much
of Agenda A amounts to gaining an understanding of
the program and outlining the services you can
perform, then negotiating them with the members
of the staff who will use the information.

Agenda A's five steps and twelve substeps are
outlined by this flowchart:



1. FIND OUT AS MUCH AS YOU CAN ABOUT THE PROGRAM

   a. Collect and scrutinize written documents that
      describe the program

   b. Talk to people

2. FOCUS THE EVALUATION

   a. Judge the adequacy of the available written
      documents for describing the program

   b. Visualize what you might do as evaluator

   c. Assess your [text illegible in source]

   d. Think of how you can cut costs

3. NEGOTIATE YOUR ROLE

   a. Agree about the basic outline of the evaluation

   b. Stay alert for two potential snags in the
      evaluation: lack of possibility for change;
      conflicts in your ...

4. ESTIMATE HOW MUCH THE EVALUATION WILL COST

   a. Compute a cost-of-staff per-unit-time figure for
      each job role occupied by someone who will work
      on the evaluation

   b. Calculate a first estimate of who will work on the
      evaluation, and for how long

   c. Estimate the evaluation's total cost

5. COME TO AGREEMENT ABOUT SERVICES AND RESPONSIBILITIES
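
The cost step in the flowchart (a per-unit-time figure for each job role, a time estimate for each role, then a total) reduces to simple arithmetic. The Python sketch below is a hedged illustration only: the roles, daily rates, day counts, the other_costs figure, and the function name estimate_total_cost are all hypothetical and do not come from the guide.

```python
def estimate_total_cost(staffing, other_costs=0.0):
    """Sum rate times time over all job roles, then add non-staff costs."""
    staff_cost = sum(role["daily_rate"] * role["days"] for role in staffing)
    return staff_cost + other_costs

# Hypothetical job roles, daily rates, and time estimates.
staffing = [
    {"role": "evaluator", "daily_rate": 200.0, "days": 20},
    {"role": "data collector", "daily_rate": 80.0, "days": 30},
    {"role": "clerical support", "daily_rate": 60.0, "days": 10},
]

total = estimate_total_cost(staffing, other_costs=500.0)
print(total)
```

Keeping rate and time separate for each role makes it easy to revise the estimate when the negotiated role changes, since only the affected entry has to be edited.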






Step-by-Step Guide for Conducting a Summative Evaluation



Determine the purposes 
of the evaluation 



Instructions 



The job of the evaluator is to collect,
digest, and report information about a program to
satisfy the needs of one or more audiences. The
audiences in turn might use the information for
any of three purposes:

• To learn about the program or its components

• To satisfy themselves that the program they were
promised did indeed occur, and if not, what
happened instead

• To make decisions about continuing or
discontinuing, expanding or limiting the program,
generally through giving or withholding funds

If decisions hinge on your findings, your first
job is to find out what the decisions are. Then you
will have to ensure that you collect the
appropriate information and report it to the correct
audiences.



□ 



& gin your descriptions of che 
decisions to be made and your 
audlence(s) by answering th* 
following questions: 



{""1 What is the title of the program to be eval- 
uated ? . 



Throughout this chapter, this program will be 
referred to as Program X . 

□ What decisions will be based on the evaluation?



□ Who wants to know about the program? That is, who is the evaluation's audience?

• Teachers ____  Report due ____
• Administrators ____  Report due ____
• Counselors or department heads ____  Report due ____
• District Personnel ____  Report due ____
• School Board ____  Report due ____
• Superintendent ____  Report due ____
• State Department of Education ____  Report due ____
• Federal Personnel ____  Report due ____
• Parents ____  Report due ____
• Community in general ____  Report due ____
• Other (special interest groups, for instance) ____  Report due ____



Try not to serve too many audiences at once. To produce a credible summative evaluation, your position must allow you to be objective.

Ask the people who constitute your primary audience this question:

□ What would be done if Program X were to be found inadequate?



Here name another program or the old program, or indicate that they would have no program at all. What you enter in this blank is the alternative with which Program X should be compared. There could be many alternatives or competitors, but select the most likely alternative.






This most-likely-alternative-to-Program-X, its closest competitor, is referred to throughout this guide as Program C. Write it after the word "or" in the next sentence:

A choice must be made between continuing Program X . . .




This is Program C. 



If at all possible, set up or locate a control or comparison group which receives Program C.

Chapter 1 of How To Design a Program Evaluation describes evaluation designs, some of them fairly unorthodox, which might be useful for situations where control groups are difficult to set up. Using a control group greatly increases the interpretability of your information by providing a basis of comparison from which to judge the results that you obtain. Pages 24 to 32 of the same book describe different sorts of control groups and the programs they might receive.




□ Has one of the evaluation's audiences, such as a Federal or State funding agency, stated specific requirements for this evaluation? Are you required, for instance, to use particular tests, to measure attainment of particular outcomes, or to report on special forms? If so, summarize these evaluation requirements by quoting or referencing the documents that stipulate them.



What is the absolute deadline for the earliest evaluation report? Record the earliest of the dates you listed when describing audiences.

Evaluation Report must be ready by ____
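The deadline rule above (take the earliest of the report dates listed for the audiences) can be sketched in a few lines of Python. This is only an illustration: the audience names and dates below are invented, and the helper name `earliest_deadline` is hypothetical.

```python
from datetime import date

# Hypothetical report due dates, one per audience (illustrative values only).
report_due = {
    "School Board": date(1986, 6, 15),
    "Superintendent": date(1986, 5, 1),
    "Parents": date(1986, 9, 30),
}

def earliest_deadline(due_dates):
    """Return (audience, date) for the earliest report due date."""
    audience = min(due_dates, key=due_dates.get)
    return audience, due_dates[audience]

audience, deadline = earliest_deadline(report_due)
print(f"Evaluation report must be ready by {deadline.isoformat()} (for {audience})")
```

Keeping the audience name alongside the date is a convenience: it shows which audience is driving the schedule.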




Step 2
Find out as much as you can
about the program(s) in question



Instructions



a. Scrutinize written documents that describe Program X, Program C, or both:

□ A program proposal written for the funding agency

□ The request for proposals (RFP) written by the sponsor or funding agency to which this program's proposal was a response

□ Results of a needs assessment* whose findings the program is intended to address

□ Written state or district guidelines about program processes and goals to which this program must conform

□ The program's budget, particularly the part that mentions the evaluation

□ A description of, or an organizational chart depicting, the administrative and staff roles played by various people in the program

□ Curriculum guides for the materials which have been purchased for the program

□ Past evaluations of this or similar programs

□ Lists of goals and objectives which the staff or planners feel describe the program's aims

□ Tests or surveys which the program planners feel could be used to measure the effects of the program, such as a district-wide year-end assessment instrument

□ Tests or surveys that were used by the program's formative evaluator, if there was one

□ Memos, meeting minutes, newspaper articles: descriptions made by the staff or the planners of the program

□ Descriptions of the program's history, or of the social context into which it has been designed to fit

□ Articles in the education and evaluation literature that describe the effects of programs such as the one in question, its curricular materials, or its various subcomponents

□ Other

*A needs assessment is an announcement of educational needs, expressed in terms of the school curriculum and policies, by representatives of the school or district constituency.

Once you have discovered which materials are available, seek them out and copy them if possible.

Take notes in the margins. Write down, or dictate onto tape, comments about your general impression of the program, its context, and staff. This will get you started on writing your own description of the program. You may want to complete Step 3 concurrently with this general overview. Be alert, in particular, for the following details:

□ The program's major general goals. List separately those that seem to be of highest priority to planners, the community, or the program's sponsors. Note where these priorities differ across audiences, since your report to each should reflect the priorities of each.

□ Specifically stated objectives

□ The philosophy or point of view of the program planners and sponsors, if these differ

□ Examples of similar programs that planners intend to emulate

□ Writers in the field of education whose point of view the program is intended to mirror




□ The needs of the community or constituency which the program is intended to meet, whether these have been explicitly stated or seem to implicitly underlie the program

□ Program implementation directives and requirements, described in the proposal, required by the sponsor, or both

□ The amount of variation tolerated by the program from site to site, or even student to student

□ The number and distribution of sites involved

□ Canned or prepublished curricula to be used for the program

□ Curriculum materials to be constructed in-house

□ Plans which have been developed describing how the program looks in operation: times of day, scripts for lessons, etc.

□ Administrative, decision-making, and teaching roles played by various people

□ Staff responsibilities

□ Descriptions of extra-instructional requirements placed on the program, such as the need to obtain parental permissions or to include teacher training or community outreach activities

□ Student evaluation plans

□ Teacher evaluation plans

□ Program evaluation plans

□ Descriptions of program aspirations that have been stated as percents of students achieving certain objectives and/or deadlines by which particular objectives should be reached



b. Talk to people

Once you have arrived at a set of initial impressions, check these, and your germinating evaluation plans, by seeking out people who can give you two kinds of information:

• Advice about how to go about collecting formative information for a program of this sort

• Answers to your questions about what the program is supposed to be and do, including which and how much modification can occur based on your findings

Check your description of the program against the impressions and aspirations of your audiences and the program's planners and staff. By all means, contact the people who will be in the best position to use the information you collect, your primary audience.

Try to think at this time of other people whose actions, opinions, and decisions will influence the success of the evaluation and the extent to which the information you collect will be useful and used. Make sure that you talk with each of these people, either at a group meeting or individually. Seek out in particular:

□ Evaluators who have worked with this particular program or programs like it. They will have valuable advice to give about what information to collect, how, and from whom.

□ School or district personnel not directly connected with the project, but whose cooperation will help you carry out the evaluation more efficiently or quickly. Negotiate access to the programs!

□ Influential parents or community members whose support will help the evaluation go more smoothly






□ The project director(s)

□ Teachers, particularly those who seem to have most influence

□ Program planners and designers

□ Curriculum consultants to the project

□ Members of advisory committees

□ Influential or particularly helpful students

□ The people who wrote the proposal

If they are too busy to talk, send memos to key people. Describe the evaluation, what you would like them to do for you, and when.

In your meetings with these people, you should communicate two things:

• Who you are and why you are evaluating the program

• The importance of your staying in contact with them throughout the course of the evaluation

They, in turn, should point out to you:

• Areas in which you have misunderstood the program's objectives or its description

• Parts of the program which will be alternatively emphasized or relatively disregarded during the term of the evaluation

• Their decision about the boundaries of the cooperation they will give you

Keep a list of the addresses and phone numbers of the people you have contacted, with notations about the best time of day to call them or the times when it is easiest for them to attend meetings.

If possible, observe the program in operation or programs like it. Take a field trip in the company of program planners and staff. Have them point out the program's key components and major variations.

Take careful notes of everything you see and hear. Later you may find some of these valuable.






Step 3 
Focus the evaluation 



Instructions



a. Judge the adequacy of the available written documents for describing the program

Make a note of your impressions of the quality and specificity of the program's written description. Answer these questions in particular:

• Are the written documents specific enough to give you a good picture of what will happen? Do they suggest which components you will evaluate and what they will look like?

□ yes □ no □ uncertain




b. Visualize what you might do as formative evaluator

Base this exercise upon your impressions of the program:

• Which components appear to provide the key to whether it sinks or swims?

• Which components do the planners and staff most emphasize as being critically important?

• Which are likely to fail? Why?

• Have program planners written a clear rationale describing why the particular activities, processes, materials, and administrative arrangements in the program will lead to the goals and objectives specified for the program?

□ yes □ no □ uncertain

• What might be missing from the program as planned that could turn out to be critical for its success?

• Where is the program too poorly planned to merit success?

• Is the program that is planned, and/or the goals and objectives toward which it aims, consistent with the philosophy or point of view on which the program is based? Do you note misinterpretations or conflicting interpretations anywhere?

□ yes □ no □ uncertain

• Which student outcomes will it probably be easiest to accomplish? Which will be most difficult?

• What effects might the program have that its planners have not anticipated?

If your answer to any of these questions is no or uncertain, then you will have to include in your evaluation plans discussions with the planners and staff to persuade them to set down a clear statement of the program's goals and rationale.



While conducting this exercise by yourself, do not be afraid of being hard on the program. It is your job to foresee potential problems that the program's planners might overlook.

When you think about the service you can provide, you will, of course, need to consider two important things besides program characteristics and outcomes. These are the budget, which you will work out in Agenda B, and your own particular strengths and talents.




c. Assess your own strengths

You will best benefit the program in those areas where your visualization in Step 3b matches your expertise. You should "tune" the evaluation to build on your skills as:

□ A researcher

□ A group process leader or organizational facilitator

□ A subject matter "expert", perhaps a curriculum designer

□ A former teacher in the relevant subject areas

□ An administrator

□ A facilitator for problem solving

□ A counselor or therapist

□ A linking agent (See page 27, Chapter 2)

□ A good listener or speaker

□ An effective writer

□ A synthesizer or conceptualizer

□ A disseminator of information, or public relations promoter

□ Other



d. Think of how you can cut costs

Since the services you can envision providing probably exceed your budget, think of how you can cut costs.



Step 4 
Negotiate your role



Instructions 



Chapter 2 presented a general outline of the tasks that often fall within the evaluator's role. You will have to work out your own job with your own audience. Meet again and confer with the people whose cooperation will be necessary: those whose decisions about the program carry most influence and who will cooperate when you gather information. You may, of course, also want to meet with other audiences.



a. Agree about the basic outline of the evaluation

□ Agree about the program characteristics and outcomes that will be your major focus, regardless of the prominence given them in official program descriptions. Ask the planners and staff these questions:

• Which characteristics of the program do you consider most important for accomplishing its objectives? Might you have implemented it in a different way than is currently planned? Would you be willing to undertake a planned variation study and try this other way?

• What components would you like the program to have which are currently not planned? Might we try some of these on a pilot basis?

• Are there particularly expensive, troublesome, controversial, or difficult-to-implement parts of the program that you might like to change or eliminate? Could we conduct some pilot or feasibility studies, altering these on a trial basis at some sites?

• Which achievements and attitudes are of highest priority?

• On which achievements and attitudes do you expect the program to have most direct and easily observed effect?

• Does the program have social or political objectives that should be monitored?

□ Agree about the sites and people from whom you will collect information. Ask these questions:

• At which sites will the program be in operation? How geographically dispersed are they?

• How much does the program as implemented vary from site to site? Where can such variations be seen?

• Who are the important people to talk with and observe?

• When are the most critical times to see the program: occasions over its duration, and also hours during the day?

• At what points during the course of the program will it be best to measure student progress, staff attitudes, etc.? Are there logical breaking points at, say, the completion of particular key units or semesters? Or does the program progress steadily, or each student individually, with no best time to measure?

• Would it be better to monitor the program as a whole periodically, or should the effectiveness of various program subparts be singled out for scrutinizing, or both?

How To Measure Program Implementation describes ways to use records kept during the program to back up descriptions of its implementation. See pages 79 to 88.





More detailed description of sampling plans is contained in How To Measure Program Implementation, pages 60 to 64. How To Design a Program Evaluation, pages 33 to 45, describes decisions you might make about when to make measurements.

□ Agree about the part the staff will play in collecting, sharing, and providing information. Explain to the staff that its cooperation will allow you to collect richer and more credible information about the program, with a clearer message about what needs to be done. Ask:

• Can records kept during the program as a matter of course be collected or copied to provide information for the evaluation?

• Can record-keeping systems be established to give me needed information?

• Will staff be able to share achievement information with me or help with its collection? Are they willing to administer periodic tests to samples of students?

• Will staff members be able to attend brief evaluation meetings or evaluation planning sessions?

• Will you be willing and able to take part in planned program variations or pilot tests? Will you be willing to respond to attitude surveys to determine the effectiveness of program components?

• Based on the information I collect, will you be willing to spend time on modifying the program through new instruction, lessons, organization, or staffing patterns?

• Are you willing to adopt a formative wait-and-see, experimental attitude toward the program?

□ Agree about the extent to which you will be able to take a research stance toward the evaluation. Find out:

• Will it be possible to set up control groups with whom program progress can be compared?

• Will it be possible to establish a true control group design by randomly assigning participants to different variations of the program or to a no-program control group? Will it be possible to delay introducing the program at some sites?

• Can non-equivalent control groups be formed or located?

• Will I have a chance to make measurements prior to the program and/or often enough to set up a time series design?

• Will I be able to use a good design to underlie pilot tests or feasibility studies?

• Will I be able or required to conduct in-depth case studies at some sites?

Details about the use of designs are discussed in How To Design a Program Evaluation. See in particular pages 14 to 19 and 46 to 51. Case studies are discussed in How To Measure Program Implementation, pages 31 and 32.
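The true control group design mentioned above depends on assigning participants at random. A minimal Python sketch of such an assignment follows. It is an illustration under stated assumptions, not a procedure taken from the kit: the function name `randomly_assign`, the group labels, and the participant names are all hypothetical.

```python
import random

def randomly_assign(participants, groups=("program", "control"), seed=None):
    """Shuffle participants, then deal them out to the groups in turn.

    Dealing in turn keeps group sizes within one of each other.
    """
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for index, person in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(person)
    return assignment

# Ten hypothetical students split between Program X and a control group.
students = [f"Student {n}" for n in range(1, 11)]
result = randomly_assign(students, seed=42)
for group, members in result.items():
    print(group, len(members))
```

The same function covers planned variations: pass more group labels, for example one per program variation plus a no-program control.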

□ Agree about the extent to which you will need to provide other services. Ask the staff and planners these questions:

• Do you need consultative help that stretches my role beyond collecting formative data? Do you want my advice about program modifications, for instance? Or help with solving personnel problems?

• Do you want me to serve to some degree as a linking agent? Should I, for instance, conduct literature reviews, seek consultation from similar projects, or search out services or additional people or funds to help the project?

• Should I take on a public relations role? Will you want me to serve as a spokesperson for the project? To give talks or write a newsletter, for example?

b. Stay alert for two potential snags in carrying out the evaluation

• Conflicts in your own responsibilities to the program and the sponsor

• Lack of possibility for change

□ Look out for conflicts in your own role. If your job requires that you report about the program to its sponsor or to the community at large, staff members are likely to be reluctant to share with you doubts and conjectures about the program. Since this will hamper your effectiveness, you will do best to explain to the planners and staff the following:

• What you will report about Program X. Outline the form and some of the message that the report will contain.

and/or

• That the planners and staff will have a chance to screen reports that you submit to the sponsor.

and/or

• That you are willing to write a final report describing only those aspects of the program chosen by the staff.

and/or

• That you are willing to swear confidentiality about the issues and activities that the evaluation addresses.

If you are in a hurry, and you think that you need to purchase instruments for the evaluation, then get started on this right away.

□ Look out for resistance to change on the part of planners or staff. It will be fruitless to collect data to modify the program if someone will resist modifications. Before you begin scrutinizing the program or its various components, then, you should find out where funding requirements, staff opinion, or the political surround restrict altering the program. Ask in particular the following questions. (These questions are most pertinent in a formative evaluation.)

• On what are you most and least willing, or constrained, to spend additional money? What materials, personnel, or facilities?

• Where would you be most agreeable to cutbacks? Can you, for instance, remove personnel? If particular program components were found to be ineffective, would you eliminate them? Which books, materials, and other program components would you be willing to delete?

• Would you be willing to scrap the program as it currently looks and start over?

• How much administrative or staff reorganization will you tolerate? Can you change people's roles? Can you add to staff, say, by bringing in volunteers? Can you move people (teachers, even students) from location to location permanently or temporarily? Can you reassign students to different programs or groups?

• How much instructional and curriculum change will you tolerate in the program beyond its current state? Would you be willing to delete, add, or alter the program's objectives? To what extent would you be willing to change books, materials, and other program components? Are you willing to rewrite lessons?

Include additional "what if..." questions that are more specific to the program at hand.



Step 5
Estimate the cost
of the evaluation



Instructions 



If your activities will be financed from the program budget, you will have to determine early the financial boundaries of the service you provide. The cost of an evaluation is difficult to predict accurately. This is unfortunate, since what you will be able to promise the staff and planners will be determined by what you feel you can afford to do.

Estimate costs by getting the easy ones out of the way first. Find out costs per unit for each of these "fixed" expenses:

□ Postage and shipping (bulk rate, parcel post, etc.)

□ Photocopying and printing

□ Travel and transportation

□ Long-distance phone calls

□ Test and instrument purchase

□ Consultants

□ Mechanical test or questionnaire scoring

□ Data processing

These fixed costs will come "off the top" each time you sketch out the budget accompanying an alternative method for evaluating the program.

The most difficult cost to estimate is the most important one: the price of person-hours required for your services and those of the staff you assemble for the evaluation. If you are inexperienced, try to emulate other people. Ask how other evaluators estimate costs and then do likewise.

Develop a rule-of-thumb that computes the cost of each type of evaluation staff member per unit time period, such as "It costs $4,500 for one senior evaluator, working full time, per month." This figure should summarize all expenses of the evaluation, excluding only overhead costs unique to a particular study, such as travel and data analysis.



The staff cost per unit figure should include:

Salary of a staff member for that time unit
+ Benefits
+ Office and equipment rental
+ Secretarial services
+ Photocopying and duplicating
+ Telephone
+ Utilities

The office items (office and equipment rental through utilities) equal the total routine expenses of running your office for the time unit in question, divided by the number of full-time evaluators working there.
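The staff cost per unit figure described above can be written as a small calculation: salary plus benefits plus one evaluator's share of routine office expenses. The Python sketch below assumes that reading. The function name `staff_cost_per_unit` is hypothetical, and the dollar amounts are invented so that the result matches the handbook's "$4,500 per month" example.

```python
def staff_cost_per_unit(salary, benefits, office_expenses, full_time_evaluators):
    """Cost of one staff member for one time unit (e.g., one month).

    salary, benefits: amounts for one staff member for the time unit.
    office_expenses: routine office costs for the same time unit
        (rental, secretarial services, duplicating, telephone, utilities).
    full_time_evaluators: number of full-time evaluators sharing the office.
    """
    if full_time_evaluators <= 0:
        raise ValueError("need at least one full-time evaluator")
    overhead_share = sum(office_expenses.values()) / full_time_evaluators
    return salary + benefits + overhead_share

# Illustrative monthly figures (assumptions, not taken from the handbook).
office = {
    "office_and_equipment_rental": 1200.0,
    "secretarial_services": 1500.0,
    "photocopying_and_duplicating": 150.0,
    "telephone": 100.0,
    "utilities": 50.0,
}
monthly_rate = staff_cost_per_unit(
    salary=2500.0, benefits=500.0,
    office_expenses=office, full_time_evaluators=2,
)
print(monthly_rate)  # 4500.0
```

Running the same function once per salary classification gives the set of rates needed for the steps that follow.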



Compute such a figure for each salary classification: Ph.D.'s, Master's-level staff, data gatherers, etc. Since the cost of each of these staff positions will differ, you can plan variously priced evaluations by juggling amounts of time to be spent on the evaluation by staff members in different salary brackets.

The tasks you promise to perform will in turn determine and be determined by the amount of time you can allot to the evaluation from different staff levels. An evaluation will cost more if it requires the attention of the most skilled and highly priced evaluators on your staff. This will be the case with studies requiring extensive planning and complicated analyses. Evaluations that use a simple design and routine data collection by graduate students or teachers will be correspondingly less costly.

To estimate the cost of your evaluation, try these steps:

a. Compute a cost-of-staff-per-unit-time figure for each job position occupied by someone who will work on the evaluation.

Depending on the amount of backup staff support entered into the equation, this figure could be as high as twice the gross salary earned by a person in that position.

b. Calculate a first estimate of which staff members' services will be required for the evaluation, and how long each will need to work.








c. Estimate the evaluation's total cost.

Refer to the proposed time span of the evaluation. Be sure to include fixed costs unique to the evaluation (travel, printing, long-distance phone calls, etc.) and your indirect or overhead costs, if any. Discuss this figure with the funding source, or compare it with the amount you know to be already earmarked for the evaluation.
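Steps a through c can be combined into one rough estimate: staff time priced at the per-unit rates, plus the fixed costs unique to the evaluation, plus any indirect costs. The Python sketch below is one way to arrange that arithmetic. Treating indirect costs as a fraction of direct costs is an assumption made here for illustration, and the function name, rates, and amounts are all hypothetical.

```python
def total_evaluation_cost(staff_plan, fixed_costs, indirect_rate=0.0):
    """First estimate of an evaluation's total cost.

    staff_plan: list of (rate_per_month, months) pairs, one per staff member.
    fixed_costs: mapping of cost item -> amount unique to this evaluation.
    indirect_rate: overhead as a fraction of direct costs (assumed convention).
    """
    staff_total = sum(rate * months for rate, months in staff_plan)
    direct = staff_total + sum(fixed_costs.values())
    return direct + direct * indirect_rate

# Hypothetical plan: a senior evaluator for 3 months, a data gatherer for 6.
plan = [(4500.0, 3), (2000.0, 6)]
fixed = {"travel": 800.0, "printing": 300.0, "long_distance_calls": 100.0}
estimate = total_evaluation_cost(plan, fixed, indirect_rate=0.25)
print(estimate)  # 33375.0
```

Changing the months assigned to each salary bracket and re-running the estimate is a quick way to compare variously priced evaluations against the amount earmarked.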

Rather than visiting an entire population of program sites, for instance, visit a small sample of them, perhaps a third; send observers with checklists to a slightly larger sample, and perhaps send questionnaires to the whole group of sites to corroborate the findings from the visits and observations. See if one or more of the following strategies will reduce your requirement for expensive personnel time, or trim some of the fixed costs.

□ Sampling

□ Employing junior staff members for some of the design, data gathering, and report writing tasks

□ Finding volunteer help, perhaps by persuading the staff that you can supply richer and more varied information or reach more sites if you have their cooperation

□ Purchasing measures rather than designing your own

□ Cutting planning time by building the evaluation on procedures that you, or people whose expertise you can easily tap, have used before

□ Consolidating instruments and the times of their administration

□ Planning to look at different sites with different degrees of thoroughness, concentrating your efforts on those factors of greater importance

□ Using pencil-and-paper instruments that can be machine read and scored, where possible

□ Relying more heavily on information that will be collected by others, such as state-administered tests, and records that are part of the program
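The tiered approach described above (visit about a third of the sites, send observers to a somewhat larger sample, and send questionnaires to all) can be sketched in Python as follows. Several details are assumptions made for illustration: the observer sample is drawn from sites not already visited, it covers about half of all sites, and the function and site names are hypothetical.

```python
import random

def plan_site_coverage(sites, visit_fraction=1/3, observe_fraction=1/2, seed=None):
    """Split sites into tiers of decreasing data-collection intensity.

    Returns a dict with:
      "visit"         - sites the evaluator visits in person
      "observe"       - other sites sent observers with checklists
      "questionnaire" - every site (questionnaires go to the whole group)
    """
    if not sites:
        raise ValueError("no sites to sample from")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    shuffled = list(sites)
    rng.shuffle(shuffled)
    n_visit = max(1, round(len(shuffled) * visit_fraction))
    n_observe = min(round(len(shuffled) * observe_fraction),
                    len(shuffled) - n_visit)
    return {
        "visit": shuffled[:n_visit],
        "observe": shuffled[n_visit:n_visit + n_observe],
        "questionnaire": list(sites),
    }

sites = [f"Site {i}" for i in range(1, 13)]  # twelve hypothetical sites
coverage = plan_site_coverage(sites, seed=1)
for tier, members in coverage.items():
    print(tier, len(members))
```

Drawing the tiers at random, rather than choosing convenient sites, keeps the visited sample from being biased toward the most cooperative schools.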






"Instructions on How to Develop a Proposal" will go here 



Step 6



Come to agreement
about services
and responsibilities



Instructions



This agreement, made on ____, 19__, describes a tentative outline of the formative evaluation of the ____ project, funded by ____ for the academic year ____ to ____. The evaluation will take place from ____, 19__ to ____, 19__. The evaluator for this project is ____, assisted by ____ and ____.

Focus of the Evaluation

The program staff has communicated its intention that the evaluator monitor periodically the implementation of the following program characteristics and components across all sites:



Implementation of the following planned or natural program variations will be monitored as well:



The evaluator will monitor periodically progress in the achievement of these cognitive, attitudinal, and other outcomes:



The evaluator, in addition, will conduct feasibility and pilot studies to answer the following questions:



The evaluator will provide, as well, the following services to the staff and planners:



Data Collection Plans



Program Monitoring and Unit Testing

Data collection for ongoing formative monitoring of implementation and progress toward objectives will take place during the following periods: from ____ to ____; from ____ to ____; and from ____ to ____. These dates were chosen because ____

Interim reports, delivered to ____ and to ____, will be due on ____, 19__, ____, 19__, and ____, 19__.

Approximately ____ program and ____ control sites for collection of implementation data will be chosen on a ____ (random/volunteer) basis. Of these, ____ will be studied intensively using a case study method; ____ will be examined by means of observation and interviews; and ____ will receive questionnaires or have records reviewed only. Staff members filling the following roles will be asked to cooperate:



Approximately ____ program and ____ control sites will take part in each assessment of progress toward program outcomes. These will be chosen on a ____ basis.

During each assessment period listed above, the following types of instruments will be administered to students and ____:



Pilot and Feasibility Studies

Pilot and feasibility studies will be conducted at approximately ____ sites, chosen on a ____ basis. The purpose and probable duration of each study is outlined below:



Tentative completion dates for these studies are ____, 19__, ____, 19__, and ____, 19__, with reports delivered to ____ and ____ on ____, 19__ and ____, 19__.

The following implementation, attitude, achievement, and other instruments will be constructed for the pilot studies:



Staff Participation

Staff members have agreed to cooperate with and assist data collection during monitoring, unit testing, and pilot studies in the following ways:



Fvaluaior's Handbook* 




Indirect CoscS 



ic:\i COSTS $ 



Variance Clause 

The staff and planners of the ______ program,
and the evaluator, agree that the evaluation
outlined here represents an approximation of the
formative services to be delivered during the
period ______, 19__ to ______, 19__.

Since both the program and the evaluation are
likely to change, however, all parties agree that
aspects of the evaluation can be negotiated.



The contract outlined here prescribes the
evaluation's general outline only. If you plan to
describe either the program or the evaluation in
greater detail, then include tables such as
Tables 2 and 3 in Chapter 2, pages 28 and 31.



Approximately ______ meetings will be needed to
report and describe the evaluation's findings.
These meetings, scheduled to occur a few days 
after submission of interim reports, will be 
attended by people filling the following roles: 



The planners and staff have agreed that decisions 
such as the following might result from the for- 
mative evaluation: 



Budget 

The evaluation as planned is anticipated to 
require the following expenditures: 

Direct Salaries $ 



Evaluation and Assistant Benefits $

Other Direct Costs:

Supplies and materials
Travel
Consultant services
Equipment rental
Communication
Printing and duplicating
Data processing
Equipment purchase
Facility rental $



Total Direct Costs $ 




AGENDA B 

Select Evaluation Design 
and Appropriate Measures 



[Flow chart of the Agenda B steps. Legible labels: select
design, plan analysis, choose sampling strategy, monitoring
system, instruments.]



(Many of the steps in Agenda B are still to be combined. 
It will be more efficient after I receive the revised 
Implementation book. I assume the Design book will not 
have changed substantially.) 







Set up the evaluation designs 



Instructions 



In Phase B you selected instruments with which to
carry out measurements and you chose an evaluation
design to determine when, and to whom, they would
be administered. The purpose of this step is to
help you ensure that the design is carried out.




Uses of design and random assignment are treated
in depth in How To Design a Program Evaluation.



The three checklists which follow are intended to
help you keep track of the implementation of the
design you have chosen. Set up the checklist that
is relevant to your particular design.



Checklist for a Control Group Design
With Pretest (Designs 1, 2, and 3)

1. Name the person responsible for setting up
   the design ______

If the design uses a true control group:

2. Will there be blocking? □ yes □ no
   (See How To Design a Program Evaluation,
   pages 149 and 150.)

3. If yes, based upon what?
   □ ability □ sex □ [illegible] □ other ______

4. Has randomization been completed?
   □ yes Date ______



If the design uses a non-equivalent control group:

5. Name this group ______

6. List the major differences between the program
   and comparison groups, for example, sex, SES,
   ability, time of day of class, geographical
   location, age:

7. Has contact been made to secure the cooperation
   of the comparison group? □ yes Date ______

8. Agreement received from (Ms./Mr.) ______

9. Agreement was in the form of (letter/memo/
   personal conversation/etc.) ______

10. Confirmatory letter or memo sent? □ yes
    Date ______

11. Is there a list of students receiving the
    comparison program? □ yes
    Where is it? ______

In either case:

12. Name of pretest ______

13. Pretest completed? □ yes Date ______

14. Teachers (or other program implementors)
    warned:

    □ To avoid confounds? Memo sent or meeting
      held (date) ______

    □ To avoid contamination? Memo sent or
      meeting held (date) ______

    (See How To Design a Program Evaluation,
    page [illegible].)

15. List of possible confounds and contaminations

16. Check made that both programs will span the
    [remainder illegible]

17. Posttest given? □ Date ______







Checklist for a Time Series Design
(Designs 4 and 5)

1. [item missing in source]

2. [partly illegible] to be administered and
   [illegible]

3. Equivalent forms of instruments to be:
   □ Made in-house? □ Purchased?

4. [partly illegible] to be made

5. Dates of planned measurements:
   □ 1st ______ □ 2nd ______ □ 3rd ______
   □ 4th ______ □ 5th ______
   Additional: □ ______

If the design uses a control group:

6. Name of control group ______

7. List of major differences between the program
   and control groups: sex, SES, ability,
   geographical location, age

8. [item missing in source]

9. Agreement received from (Ms./Mr.) ______

10. Confirmatory letter or memo sent? □
    Date ______

11. List of possible contaminations



Checklist for a Pre-Post Design
With Informal Comparisons (Design 6)

1. Name the person responsible for setting up
   the design ______

2. Comparison to be made between obtained
   posttest results and pretest results? □

   • Name(s) of instrument(s) to be used

   • Equivalent forms of instruments to be:
     □ made □ purchased

   • Number of students receiving Form A on
     pretest and Form B on posttest ______

   • Dates of planned measurements:
     Pretest ______ Completed? □
     Posttest ______ Completed? □

3. Comparison to be made via standardized
   tests? □

   • Name of standardized test(s)

   • Test given? □ Date ______

4. Comparison to be made between obtained results
   and results described in curriculum
   materials? □

   • Name of curriculum materials

   • Unit test results collected and filed? □

   • Unit test results from program graphed or
     otherwise compared to norm group? □

5. Comparison to be made between results from
   this year and the results from last year? □

   • Which results from last year will be used,
     for example, grades, district-wide tests?

   • Last year's results tabulated and graphed? □

   • List made of possible differences between
     this year's and last year's (or last time's)
     group that might differentially affect
     results? □

   • Program X's results collected? □

   • Program X's results scored and graphed or
     otherwise compared with last year's? □






6. Comparison to be made between obtained
   results and prespecified criteria about
   attainment of program objectives? □

   • Whose criteria are these, for example,
     teachers, district, curriculum developers?

   • State the criteria to be met

   • Objectives-based test results collected and
     filed? □

   • Objectives-based test results graphed, or
     otherwise compared, with criterion? □






If you have chosen to administer 
instruments at only a sample of pro- 
gram sites or to a sample of 
respondents, then use the following 
table to keep track of the proper implementation 
of your sampling plan. 



Sampling Plan Checklist

1. The sample will ensure adequate representation
   of different types of:

   □ Sites (what kinds?) ______
   □ Time periods (which ones?) ______
   □ Program units (which ones?) ______
   □ Program roles (which ones?) ______
   □ Student or staff characteristics (name them)
     ______
   □ Other ______

2. The sampling plan comprises a matrix or cube
   with ______ cells (see How To Measure Program
   Implementation, pages 60 to 65)

3. How many cases will be sampled from each cell?
   ______ (see How To Design a Program
   Evaluation, pages 157-161, for suggestions
   about selecting random samples)

4. Cases selected? □

5. For each time selected:

   • Have instruments been administered? □
     Comments ______

   • What deviations from the sampling plan have
     occurred?
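
The sampling matrix described in the checklist above can be sketched in code. The following Python fragment is only an illustration and is not part of the Kit: the function name build_sampling_plan, the dimension names, and the case labels are hypothetical, and it assumes that the same number of cases is drawn at random from every cell.

```python
import itertools
import random


def build_sampling_plan(dimensions, cases_by_cell, per_cell, seed=0):
    """Draw up to `per_cell` cases at random from every cell of the matrix.

    dimensions: dict mapping a dimension name to its list of categories.
    cases_by_cell: dict mapping a cell (tuple of categories) to its cases.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    plan = {}
    # Each combination of categories is one cell of the matrix or cube.
    for cell in itertools.product(*dimensions.values()):
        candidates = cases_by_cell.get(cell, [])
        # Never ask for more cases than the cell actually contains.
        k = min(per_cell, len(candidates))
        plan[cell] = rng.sample(candidates, k)
    return plan


# Hypothetical example: 2 kinds of sites x 2 time periods = 4 cells.
dimensions = {"site": ["urban", "rural"], "period": ["fall", "spring"]}
cases_by_cell = {
    ("urban", "fall"): ["u1", "u2", "u3"],
    ("urban", "spring"): ["u4", "u5", "u6"],
    ("rural", "fall"): ["r1", "r2", "r3"],
    ("rural", "spring"): ["r4"],
}
plan = build_sampling_plan(dimensions, cases_by_cell, per_cell=2)
for cell, chosen in plan.items():
    print(cell, chosen)
```

A cell that holds fewer cases than requested (here the rural, spring cell) is sampled in full, which is one kind of deviation from the plan worth noting on the checklist.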







AGENDA C 

Collect and Analyze 
Data According to 
Evaluation Design 



[Flow chart of the Agenda C steps: put sampling plan into
effect; administer instruments (observe, score, record);
compile and reduce data; analyze data.]



(Some of the steps in Agenda C are to be revised further
when I have received the Implementation and the Reporting
books.)






Administer instruments, 
score them, and record data 



Instructions



Once you have decided which instruments to use,
begin acquiring them or having them made.
Ordering or constructing instruments will take a
long time, possibly months.

If you intend to buy instruments, write letters
to the various publishers for ordering them.
Check the list of test publishers in the How To
books for sources of published tests.

If you plan to construct your own instruments,
write a memo to those in charge of producing
them, leaving no doubt about who is responsible
and deadlines for their completion.

Instruments made in-house must be tried out,
debugged, and evaluated for technical quality.
To guide this process, the Kit's measurement
books discuss reliability and validity. See
Achievement, Chapter 5; Attitudes, Chapter 11;
and Implementation, Chapter 7. A little
run-through with a few students or aides might
mean the difference between a mediocre instrument
and a really excellent one.






Keep tabs on instrument orders. If you have not
received them within two weeks of the deadline,
prod the publisher or your in-house developer.

As each instrument is completed or received, plan
how it will be stored and how its results will be
recorded. [Partly illegible in source.]



If the instrument has a selected-response format,
such as a test or rating scale, make sure you have
a scoring key or template.

If it has an open-ended format, make sure you have
a set of correctness criteria for scoring, or a
way of categorizing and coding questionnaire or
interview responses.

See How To Measure Program Implementation,
pages 71 to [illegible], and How To Measure
Attitudes, pages 106 and 107, and 170 and 171.
These sections contain information about scoring
or coding open-response items, essays, and
reports. If the test is to be scored elsewhere by
a state or district office or by an agency with
whom you have a contract for testing and scoring,
and you are to receive a print-out of the
results, decide whether you wish to score
sections of it for your own [illegible]. In some
cases, achievement of objectives can be measured
via partial scoring of a standardized test.

See How To Measure Achievement, pages [illegible]
to 39, for a description of a technique for
doing this.



Record results per measure onto a data summary
sheet

Once you know what the scores from your
instruments will look like, decide whether you
want results for each examinee, mean results for
each [illegible], or [illegible] results for each
[illegible]. Then, when each instrument has been
administered, score the instruments as soon as
possible.

Once scoring is completed, consult the
appropriate How To books for suggestions about
formatting and filling in data summary sheets.
See [title illegible], pages 159 to [illegible];
Implementation, pages 67 to 71; and Achievement,
pages 117 to 120.

Construct your data summary sheets for Program X
and the comparison group so that it is impossible
to get them confused. Then delegate the scoring
and recording tasks.







A table like the following should 
help you keep track of instrument 
development, administration, 
scoring, and data recording. 



Instrument | Completion/Receipt Deadline | Administration Deadlines (post) | Scoring Deadline | Recording Deadline







Analyze data with an eye toward
implications for policy and for
program improvement



Instructions 



If you are periodically monitoring the program,
and particularly if there is a control group, then
you have collected a battery of general measures
that can be analyzed using fairly standard
statistical methods. Consider whether you will:

□ Graph results from the various instruments

□ Perform tests of the statistical significance
  of differences in performance among groups or
  from a single group's pretest and posttest

□ Calculate correlations to look for
  relationships

□ Compute indices of inter-rater reliability
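
Three of the analyses listed above can be sketched with Python's standard library. This is a minimal illustration under stated assumptions, not the Kit's own procedure: the function names and all scores are invented, the significance test shown is a paired t statistic for one group's pretest and posttest, and simple percent agreement stands in for an index of inter-rater reliability.

```python
import math
import statistics


def paired_t(pretest, posttest):
    """t statistic for the mean gain of one group tested twice."""
    gains = [post - pre for pre, post in zip(pretest, posttest)]
    mean_gain = statistics.mean(gains)
    sd_gain = statistics.stdev(gains)  # sample standard deviation (n - 1)
    return mean_gain / (sd_gain / math.sqrt(len(gains)))


def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def percent_agreement(rater_a, rater_b):
    """Share of observations that two raters coded identically."""
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)


# Invented scores for six students.
pretest = [10, 12, 9, 14, 11, 13]
posttest = [13, 15, 10, 18, 12, 16]
t_value = paired_t(pretest, posttest)
r_value = pearson_r(pretest, posttest)
agreement = percent_agreement([1, 2, 3, 3], [1, 2, 2, 3])
print(t_value, r_value, agreement)
```

The t statistic has n - 1 degrees of freedom and still has to be compared with a critical value from a t table to judge significance.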



How To Measure Achievement discusses using test
results for statistical analysis on pages 125 to
145. How To Measure Attitudes describes attitude
test scores used for calculating statistics on
pages 170 to 177. See, as well, How To Measure
Program Implementation, pages 67 to 77. Problems
of calculating inter-rater reliability are
discussed in all three books. Specific
statistical analyses are described in How To
Calculate Statistics.

All of the Kit's How To books contain suggestions
for building graphs and tables to summarize
results. For [illegible], see the relevant How To
book. Consult, as well, Chapter 4 of How To
Present an Evaluation Report.



When each graph and statistical test is
completed, examine it carefully and write a
one-or-two sentence description that summarizes
your conclusions from reading the graph and
noting the results of the analysis.

Save the graphs and summary sentences that seem
to you to give the clearest picture of the
program's impact. These can be used as a basis
for the Results section of your report.







Remember that in addition to describing program
implementation and the progress in development of
skills and attitudes of various participants, you
may also need to note whether the program is
keeping pace with the time schedule that has been
mapped out.

If you have focused data collection on specific
program units, or if you are conducting pilot
tests, then in addition to performing statistical
analyses, consider whether the program has
achieved each of the objectives in question.
In particular, examine these things:

• Student achievement

• Participants' attitudes about the program
  component in question

• The component's implementation

Below you will find four cases describing results
you might obtain, with suggestions about what to
do about each. Determinations of good, poor, and
adequate performance should be based on the
performance standards set in Step [illegible].



Case 1

• Achievement test results: good

• Program implementation: adequate

• Attitude results: poor

What to do? Check the technical quality of the
instrument. See How To Measure Attitudes,
pages 131 to 151. Find out what is causing bad
morale:

□ Is the program too easy? Pretest students for
  upcoming program units to see if they have
  already mastered some of the objectives.

□ Is the program too difficult? If this
  complaint is widespread, try to alleviate the
  pressure of the work.

□ Is this part of the program dull? The response
  to this depends on the students and subject
  matter. Try to find motivators for the
  students, or help teachers to invent ways to
  make instruction more appealing and relevant.
  If minor changes offer no promise, the staff is
  convinced of the importance of program
  objectives, and the rest of the program seems
  more interesting, then don't revise.






Case 2

• Achievement test results: good

• Program implementation: poor

• Attitude results: good

What to do? First ask:

□ Did the achievement test and the program
  component address the same objectives? If not,
  there's your answer! If so, check the technical
  quality of the implementation measure. See
  How To Measure Program Implementation, pages
  129 to [illegible].

Then ask:

□ What happened in the program instead of what
  was planned? Make sure that students did not
  learn from the mistakes they made while
  struggling through poor instruction. If
  possible, suggest that the instruction that did
  occur become officially part of the program.



Case 4

More than two of the indicators show
unsatisfactory results. In any of these cases,
you should investigate the cause of the problem
and revise as necessary.



Case 3

• Achievement test results: poor

• Program implementation: good

• Attitude results: good

What to do? First ask:

□ Did students misinterpret test items in some
  way, and if so, how?

Then assure yourself that the objective
underlying the test matches the objective
underlying the instruction. If so, examine the
technical quality of the achievement test. See
How To Measure Achievement, pages 89 to 115.

Then ask:

□ Was student performance during program
  implementation good? If so, check whether the
  amount of practice given to students was
  sufficient to allow them to master the
  objective.

□ Was student performance on program tasks poor?
  If so, explore whether sufficient time was
  given for practice and whether students lacked
  prerequisite skills necessary to learn the
  material. You may need to give diagnostic
  tests to locate students' skill deficiencies.
  Check to see whether the instruction itself was
  difficult or confusing. Did students
  understand what was expected of them?






AGENDA D 



Report and Confer with
Planners and Staff



The key to an effective evaluation is
good communication. Especially in the
case of a formative evaluation,
information about where the program is or is
not working needs to be timely and
clearly presented so that appropriate
changes may be made. Summative reports
must also be timely and carefully
prepared if they are to have an impact
on policy decisions.







Instructions 



You want to get the information across quickly and
succinctly. Therefore think about each instrument
you have administered and:

□ Make a graph or table summarizing the major
  quantitative findings you want to report.
  How To Calculate Statistics, pages 18 to 25,
  describes how to graph test scores.

□ See How To Present an Evaluation Report,
  [chapter illegible], for suggestions about
  organizing your message and for an outline of
  an evaluation report. Look over the outline and
  decide which of the topics apply to your
  report. If you will need to describe program
  implementation, look at the report outline in
  Chapter 2 of How To [title illegible].



□ Write a quick general outline of what you plan
  to [remainder missing in source]

If you are submitting an interim report for a
program that is being assembled from scratch, you
should include in Section II of the report a few
paragraphs dealing with progress in program
design. They might be entitled, for instance,
Materials Production or Staff Development. The
paragraphs should address these questions:

□ Has research been conducted to determine the
  sort of curriculum that is appropriate to the
  program? Who conducted this research? How
  useful has it been?

□ What materials development has been promised
  for the program? For which objectives? For
  which sites? What student materials? Any
  teacher manuals? Any teacher training
  materials? Any audiovisuals? Has the staff
  promised to expand or revise something
  previously existing? Did they submit in the
  proposal an outline, plan, or prototype of
  the promised materials? Are the materials
  being produced in accordance with this? Have
  there been changes? Has the staff decided to
  not develop something they promised? Why? How
  is development of these particular materials
  progressing? Is it on schedule? Behind? Why?
  Does the staff plan to catch up by year's end,
  or is this unnecessary because they are well
  ahead of student progress? How much of the
  intended materials development will be
  completed by the end of the evaluation?

□ What staff development and training have been
  provided to ensure that planners, teachers,
  etc., are equal to the tasks of both designing
  and implementing a new program?

□ What plans for staff member participation in
  materials development are contained in the
  proposal? Is this an accurate description of
  what has occurred so far?

□ What staff-community interchanges to gather
  help with planning were mentioned in the
  proposal? What staff meetings, within the
  project or with staff members outside it, were
  planned? Did these occur? What were their
  purposes and outcomes?





Step 2 

Choose a method of presentation 



Instructions 



If the manner of reporting was not negotiated
during Agenda A, decide whether your report to
each audience will be oral or written, formal or
informal.

Chapter 3 of How To Present an Evaluation Report
lists a set of pointers to help you organize what
you intend to say and decide how to say it.




Step 3 
Assemble the report 



Instructions 




Follow the outline described in [illegible].
The report should include:

• A description of why you undertook the
  evaluation

• Who the decision-makers were

• The kinds of questions you intended to ask,
  the evaluation designs you used, if any, and
  the instruments you used to measure
  implementation, achievement, and attitudes

• The data collection methods which you used

If you have found instruments which were
particularly useful, or sensitive to detecting
the implementation or effects of the particular
program, put them in an appendix.

The report should conclude, importantly, with
suggestions to the summative evaluator, if indeed
a summative evaluation of this particular program
will be conducted.



A worksheet like the one below will
help you to record your decisions
about reporting and to keep track
of the progress of your report.



Final Report Preparation Worksheet

1. List the audiences to receive each report,
   date reports are due, and type of report to
   be given to each audience. Some reports may
   be suitable for more than one audience.

   Audience | Date report due | Type of report

2. How many different reports will you have to
   prepare? ______

3. For each different report you submit, complete
   this section:

   Report #1 Audience(s) ______







Checklist for Preparing Evaluation Report:

• Report will be: □ formal □ informal
                  □ oral □ written

• Deadline for finished draft ______
  Completed? □

• Deadline for finished audio-visuals, if any
  ______ Completed? □

• Deadline for finished tables and graphs ______
  Completed? □

• Names of proofreaders of final draft,
  audio-visuals, or tables:

  ______ Contacted and agreement made? □
  ______ Contacted and agreement made? □
  ______ Contacted and agreement made? □

• Date agreed upon as deadline for getting
  drafts to proofreaders. These are absolute
  deadlines for completing drafts:

  ______ Draft sent? □
  ______ Draft sent? □
  ______ Draft sent? □

• Dates drafts must be received in order to
  revise in time for final report deadlines:

  ______ Proofread draft received? □
  ______ Proofread draft received? □
  ______ Proofread draft received? □



This is the end of the Step-by-Step Guide for
Conducting a [illegible] Evaluation. By now
evaluation is a familiar topic to you and,
hopefully, a growing interest. This guide is
designed to be used again and again. Perhaps you
will want to use it in the future, each time
trying a more elaborate design and more
sophisticated measures. Evaluation is a new
field. Be assured that people evaluating programs,
yourself included, are breaking new ground.






Chapter & 



Step-by-Step Guide 
For Conducting a 
Small Experiment 



The self-contained guide which comprises this chapter will
be useful if you need a quick but powerful pilot test, or a
whole evaluation, of a definable short-term program or
program component. The guide provides start-to-finish
instructions and an appendix containing a sample evaluation
report. This step-by-step guide is particularly appropriate
for evaluators who wish to assess the effectiveness of
specific materials and/or activities aimed toward
accomplishing a few specific objectives.

If a major purpose of the program you are evaluating is
to produce achievement results, this guide outlines an ideal
way to find out how good these results are: conduct an
experiment. For a period of days, weeks, or months, give
students the program or program component you wish to
evaluate while an equivalent group, the control group, does
not receive it. Then at the end of the period, test both
groups. This step-by-step guide shows you how to conduct
such an evaluation.

Whenever possible, the step-by-step guide uses checklists
and worksheets to help you keep track of what you have
decided and found out. Actually, the worksheets might be
better called "guidesheets," since you will have to copy
many of them onto your own paper rather than use the one
in the book. Space simply does not permit the book to
provide places to list large quantities of data.

As you use the guide, you will come upon references
marked by a symbol. These direct you to read sections of
various How To books contained in the Program Evaluation
Kit. At these junctures in the evaluation, it will be
necessary for you to review a concept or follow a
procedure outlined in one of the Kit's seven resource
books:

• How To Deal With Goals and Objectives 

• How To Design a Program Evaluation 

• How To Measure Program Implementation 
• How To Measure Attitudes

• How To Measure Achievement 

• How To Calculate Statistics 

• How To Present an Evaluation Report 

Should You Be Using This Step-By-Step Guide? 

The appropriateness of this guide depends on whether or
not you will be able to set up certain preconditions to make
the evaluation possible. Check each of the preconditions
listed in Step 1. If you can arrange to meet all of them,
then you can use the evaluation strategy presented in this
guide. As you assess the preconditions, you will be taking
the first step in planning the evaluation. This step-by-step
guide lists 13 steps in all. A flow chart showing
relationships among these steps appears in Figure 5. You may
wish to check off the steps as they are accomplished.



1. Assess preconditions
2. Meet and confer
3. Record the evaluation plan
4. Prepare or select the tests
5. Prepare list of students
6. Give the pretest
7. Form the experimental and control groups
8. Arrange for your implementation measures
9. Run the program one cycle
10. Posttest the E-group and the C-group
11. Analyze
12. If your task is formative, meet with the staff to
    discuss results
13. Write a report if necessary

Figure 5. The steps for accomplishing a small experiment, listed in this guide







Step 1 
Assess Preconditions 



Instructions 



Put a check in each box if the precondition can
be met. For the first three preconditions, there
are some decisions to be recorded on the lines
provided. Record these decisions in pencil since
you may change them later. This step-by-step
guide will be useful to you only if you can meet
all five preconditions.

□ PRECONDITION 1. An outcome measure will be
available.

A test can be made or selected to measure what
students are supposed to learn from the program.
Write down what the outcome measure(s) will
probably be:

□ PRECONDITION 2. A sample of cases* can be
defined.

You can list at least 12, say, students for whom
this program would be suitable and for whom,
therefore, the outcome measure is an appropriate
test of what they learned in the program. Write
down the criteria that will be used to select
students for the sample:

□ PRECONDITION 3. A time period (a cycle) can be
identified.

You can identify a time period which is of a
duration appropriate to teach the skills the
outcome measure taps. Call this period of time
one cycle of the program. Write down what
length of time one cycle of the program will
probably last:



□ PRECONDITION 4. An experimental group and a
control group can be set up.

For one cycle at least, one group of students
in the sample will get the program and another
will not. If the program can run through
several cycles, this does not mean that some
students will never get the program, just that
they must wait their turn. In this way, no
students are left out, a concern which sometimes
makes people unwilling to run an experiment.

□ PRECONDITION 5. Students who are to get the
program can be randomly selected.

The students who are to get the program during
the experimental cycle will be randomly selected
from the sample.

*A case is an entity producing a score on the
outcome measure. In educational programs, the
cases of interest are nearly always students,
though they could be classrooms, school
districts, or particular groups of people. The
term student is used throughout the guide. If
the cases in your situation are different, just
substitute your own term.

If each of the five preconditions listed above can
be met, then you will be able to run a true
experiment. This is the best test you can make of
the effectiveness of the program or program
component for producing measurable results.
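
Random selection under Precondition 5 can be done with a table of random numbers or, as in the hedged sketch below, with a few lines of Python. The function name assign_groups and the student labels are hypothetical, and the sketch assumes a simple random split of the sample into two groups; it does not pair or block students first.

```python
import random


def assign_groups(sample, seed=None):
    """Randomly split the sample into an experimental and a control group."""
    rng = random.Random(seed)  # recording the seed lets the draw be repeated
    shuffled = list(sample)    # copy so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # With an odd number of students the control group gets the extra one.
    return sorted(shuffled[:half]), sorted(shuffled[half:])


# Hypothetical sample of 12 students who meet the selection criteria.
students = [f"student_{i:02d}" for i in range(1, 13)]
experimental, control = assign_groups(students, seed=42)
print("Experimental group:", experimental)
print("Control group:", control)
```

Because chance alone decides who gets the program first, neither the evaluator nor the staff can tilt the comparison, and students in the control group can take their turn in a later cycle.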






Step 2 



Meet and Confer 



Instructions 




This step helps you work out a number of practical
details that must be settled before you can
complete your plans for the pilot test or
evaluation.

You will need to meet and confer with the people
whose cooperation you need and, possibly, with
members of other evaluation audiences. You will
need to reach agreement with them about:

□ How the study should be run

□ How to identify students for the program

□ What program the control group should receive

□ The appropriate outcome measure

□ Whether to use additional measures

□ What procedures will be used to measure
  implementation

□ To whom results will be reported, and how


How Should the Study Be Run?

In particular, are students to receive the
program in addition to regular instruction or
instead of regular instruction? If the program
is to be used in addition to regular instruction,
students will have to be pulled out for the
program sometime other than the regular
instruction period. A means of scheduling will
need to be agreed upon.




How Should Students Be Identified for the Sample?

It might be that the sample will simply be all
the students in a certain class or classes. On
the other hand, perhaps the program is intended
only for students who have a certain need or meet
some criterion. In this case, you will need to
agree upon clear selection criteria. If the
program is remedial, selection might be based on
low scores on a pretest, or you might use teacher
nominations. Test scores for selection are
preferred if the outcome measure is to be a test.
The problem with basing selection on an existing
set of test scores is that they might be
incomplete; scores might be missing for some
students. You could use the outcome measure as a
selection pretest.



How To Design a Program Evaluation , 
pages 35 and 36, discusses selection 
tests. See also How To Measure 
Achievement , pages 124 and 125. 



How many students will you need? The more the
better, but certainly you should avoid ending up
with fewer than six pairs of students, a total
of 12. If during the program cycle, one student
in a pair is absent too often or fails to take the
posttest, the pair will have to be dropped from
the analysis. The longer the cycle, the more
likely it is that you will lose pairs in this way.
Bearing this in mind, be sure to select a large
enough sample. If it looks as if the sample will
be too small, perhaps because the program has
limited materials, you should abandon an
experimental test or run the experiment several
times with different groups each time and then
combine results to perform a single analysis.



What Program Should the Control Group Receive? 



If one group of students will get the program and
a control group will not, the question arises
about exactly what should happen to the control
group. Should the control group receive no
instruction in the subject matter to be taught by
the program? For example, if program students
leave the classroom to work on computer-assisted
instruction in fractions, should the control
students receive instruction in fractions as well,
or should they spend their time on something else
altogether?

It is best to set up the experiment to match the
way in which the program will be used in the
future. If the program will be used as an adjunct
to regular instruction, then set up the experiment
so that the experimental group gets the program
in addition to the regular program. If the
program, on the other hand, is a replacement for
regular instruction, then the control group will
get only regular instruction and the experimental






group will get only the program. If you are
interested in assessing the effectiveness of two
separate programs, either of which might replace
the regular one, then give one to the experimental
group and one to the control.



How To Design a Program Evaluation
discusses what should happen to control
groups on pages 29 to 32.



What Outcome Measure (Posttest) Is Reasonable for
Detecting the Effect of One Cycle of the
Experiment?






The posttest must meet the requirements of a good
test. It should therefore be:

• Adequately long to have good reliability

• Representative of all the relevant objectives of
the program, to demonstrate content validity

• Clearly understandable to the students

A good posttest is essential. Whether you plan to
purchase it or construct it yourself, refer to
How To Measure Achievement.



Do You Need Other Measures in Addition to the
Outcome Measure?

Will the posttest provide a sufficient basis on
which to judge the program? If the posttest
contains many items which reflect specific details
of the program (special vocabulary, for instance,
or math problems that use a particular format),
then a high posttest score may not represent much
growth in general skills. In such a case, you
might want to use an additional posttest for
measuring achievement that contains more general
items.

Since an immediate posttest will measure the
initial impact of a program, you may wish to
measure retention by administering another test
some time later. You may, in addition, need to
measure other program outcomes such as the
attitudes of students, parents, or teachers.



See How To Measure Achievement and
How To Measure Attitudes.






See How To Measure Program Implementation.



Which Groups of People Will Be Informed About the
Results?

Check relevant audiences:

□ Teachers of students involved
□ The program's planners and curriculum designers
□ Other teachers
□ Principals
□ District personnel
□ Parents of students involved
□ Other parents
□ Board members
□ Community groups
□ State groups
□ The media
□ Teachers' organizations



Do meetings need to be held with any of these
groups, either to give information or to hear
their concerns, or for both reasons?

□ Yes □ No

If yes, hold such meetings.



You and the others involved have now finished
deciding how to do the evaluation. Once these
decisions are firm, go back to Step 1 and change
the preconditions entries you made there if
necessary.




What Procedures Will Be Used for Measuring Program
Implementation?

As the program runs through a cycle, a record
should be kept of which students actually
participated in the program and which students,
perhaps because of absences, did not. You must
also keep careful track of what the experiences
of program and control students looked like.






Step 3 



Record the Evaluation Plan 



Instructions 



Construct and complete a worksheet like the one 
below, summarizing the decisions made during 
Step 2. Contents of the worksheet can be used 
later as a first draft of parts of the evaluation 
report. 

If two programs or components are being compared,
and each is equally likely to be adopted, then you
will have to describe both carefully.




PROGRAM DESCRIPTION WORKSHEET 



This worksheet is written in the past tense so
that when you have completed it you will have a
first draft of two sections of your report: those
that describe the program and the evaluation. For
more specific help with deciding what to say,
consult How To Present an Evaluation Report.

Background Information About the Program

A. Origin of the Program

B. Goal of the Program

C. Characteristics of the Program: materials,
activities, and administrative arrangements

D. Students Involved in the Program

E. Faculty and Others Involved in the Program

Purpose of the Evaluation Study

A. Purposes of the Evaluation






B. Evaluation Design 

A pretest-posttest true experiment was used to
assess the impact of the program on student
achievement. The target sample consisted of
all who (fill in the selection criteria here)



Experimental and control groups were formed by
random selection from pairs of students matched
on the basis of the pretest.

C. Outcome Measures 



D. Implementation Measures 



Once you have completed the Worksheet, you have
prepared descriptions of the program and of the
evaluation. These descriptions will serve as your
first draft of the evaluation report.






Step 4

Prepare or Select the Tests 



Instructions 



The Pretest 

Use one of three kinds of pretests:

• A test to identify the sample of students
eligible for the program; this is a selection test

• A test of ability given because you believe
ability will affect results, and you therefore
want the average abilities of the experimental
and control groups to be roughly equal

• A pretest which is the same as the posttest,
or its equivalent, so that you can be sure that
the posttest shows a gain in knowledge that was
not there before

In most cases, the pretest should be the posttest
or the outcome measure itself. If this will be
possible in your situation, then produce a
thorough test which will be used as both pretest
and posttest.



Preparing the Pretest Yourself

How To Measure Achievement, Chapter 3, lists
resources, item banks, and guides to help you
construct a test yourself. How To Measure
Attitudes gives step-by-step directions for
constructing attitude measures of all sorts.

Once the test has been written, try it out with a
small sample of students to ensure that it is
understandable and that it yields an appropriate
pattern of scores for a pretest: not too many high
scores, so that there is room at the top for
students to show growth. The tryout students
should not be students who will be assigned to
either the experimental or control groups. You
will need at least five students for the tryout.
They should be as similar as possible to the
students who are to receive the program. You might
need to borrow students from another class or
school.




Check off these substeps in test
development as you accomplish them:

□ Test has been drafted or selected

□ Test has been tried out with a small group of
students

□ Results of the tryout have been graphed and
examined. Consult Worksheet 2A of
How To Calculate Statistics for help
with graphing scores.

□ Test has been revised, if necessary

□ Test has been reproduced in quantity ready for



If you intend to use the pretest you have
purchased or written for selection of students,
then you will, of course, have to administer
the test before you decide which students are
eligible. In this case, complete Step 6 before
Step 5.

If the pretest will be administered to program and
control groups after the groups have been formed,
then go on next to Step 5.






Step 5 

Prepare a List of Students 



Instructions

List all students for whom one cycle of the
program will be appropriate. In order to construct
this list, you must have a set of criteria for
selection. These should have been established in
Step 1 and recorded on the worksheet in Step 3.

Write the names of the students who meet the
selection criteria down the left-hand side of the
paper. Call this a sample list.

If you are using the selection test as a pretest
as well, list students in order by score, from
highest to lowest, and record each student's score
next to his or her name.

Your sample list might look like this:

SAMPLE LIST

Adams, Jane
Bellows, John
Cartwright, Jack
Dayton, Maurice
Dearborn, Fred

Eaton, Susie
James, Alice
Markham, Mark
Payne, Tom
Pine, Judy

Taylor, Harvey
Vine, Grace
Washington, Roger
Williams, Greg



Step 6 
Give the Pretest 



Instructions

It is best to give the pretest at one sitting to
all students concerned. Be sure no copies of the
test are lost. All tests handed out must be
returned at the end of the testing period. For
obvious reasons, this is critical if the test will
be used again as a posttest.

Tests are more likely to get lost when they use a 
separate answer sheet which is also collected 
separately. If your test uses a separate answer 
sheet, then have students place answer sheets 
inside the test booklet, and collect the two 
together. 






Step 7



Form the Experimental 
and Control Groups 



Instructions 



Record pretest scores on the sample list if
you have not already done so.

Graph the pretest scores

Refer to Worksheet 2A of How To
Calculate Statistics for help with
this step.

Are the scores appropriate for a pretest?
That is, are scores relatively spread out with
few students achieving the maximum? If yes,
continue.

If the test was too easy, prepare and give
another test with more difficult items. The
program's instructional plans might need
revision too if a test well-matched to the
program's objectives was too easy for the
target students.



Rank order the students according to pretest
scores

If it is not already arranged according to
student scores, rewrite the sample list
starting with the student with the highest
score and working down to the lowest.

Form "matched" pairs

Draw a line under the top two students, the
next two, and so on.

Bellows     38
Eaton       36
--------------
Adams       35
Dayton      35
--------------
James       35
Payne       32
--------------
Dearborn    31
Vine
--------------





From each pair, randomly assign one student to
the experimental group and the other student
to the control group

To accomplish the random assignment, toss a
coin. Call the experimental group or E-group
"heads" and the control or C-group "tails."
If a toss for the first person in the first
pair gives you heads, assign this person to
the E-group by putting an E by his name. His
match, the other person in the pair, is then
assigned to the C-group. If you get tails,
the first person in the pair goes to the
C-group and the other to the E-group.

Repeat the coin toss for each pair, assigning
the first person according to the coin toss
and his match to the other group. If there is
an odd number of students, just randomly
assign the odd student to one or the other
group, but do not count him in the analysis
later.
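
The ranking, pairing, and coin-toss steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name form_groups is hypothetical, a random number generator stands in for the coin, and the lowest-ranked student is treated as the odd student when the count is odd (the guide does not say which student that should be). The scores are the ones from the pairing example, leaving out Vine, whose score is not legible.

```python
import random

def form_groups(scores, rng=None):
    """Rank students by pretest score, pair neighbours, and 'coin-toss' each pair.

    scores: dict mapping student name -> pretest score.
    Returns (pairs, leftover); each pair is (E-group student, C-group student).
    """
    rng = rng or random.Random()
    # Highest score first; sorted() is stable, so ties keep their input order.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    leftover = None
    if len(ranked) % 2 == 1:
        # Assumption: the lowest-ranked student is the odd one out and is
        # not counted in the analysis.
        leftover = ranked.pop()[0]
    pairs = []
    for i in range(0, len(ranked), 2):
        first, second = ranked[i][0], ranked[i + 1][0]
        if rng.random() < 0.5:   # "heads": first member goes to the E-group
            pairs.append((first, second))
        else:                    # "tails": first member goes to the C-group
            pairs.append((second, first))
    return pairs, leftover

pretest = {"Bellows": 38, "Eaton": 36, "Adams": 35, "Dayton": 35,
           "James": 35, "Payne": 32, "Dearborn": 31}
pairs, leftover = form_groups(pretest, random.Random(0))
for e_student, c_student in pairs:
    print(f"E-group: {e_student:10s} C-group: {c_student}")
print("Not paired:", leftover)
```

Passing a seeded generator, as in the usage lines, makes the assignment repeatable for record keeping; a real coin works just as well.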



Prepare a Data Sheet 

Have a list of the E-group and C-group students
typed on a Data Sheet. This sheet should place the
E-group at the left-hand side with a column for
the posttest scores, then the C-group and the
score column at the right. Always keep matched
pairs on the same row. Columns 5, 6, and 7 will
contain calculations to be performed later.

DATA SHEET

   1         2          3         4        5          6                  7
E-group   Posttest   C-group   Posttest   $d$   $d - \bar{d}$   $(d - \bar{d})^2$




















Step 8

Arrange for Your
Implementation Measures

Instructions

Ensure that the program has been implemented as
planned. This means ensuring that the students
who are supposed to get the program, the E-group,
do get it, and the others, the C-group, do not.

To accomplish this, try the following:

• Work closely with teachers to assure that the
program groups receive the program at the
appropriate times. Arrange a plan for carefully
monitoring student absences from the program.

• Set up a record-keeping system to verify
implementation of the program. For example,
students could sign a log book as they arrive for
the program, or perhaps they could turn in their
work after each session. In addition, if
possible, plan to have observers record whether
the program in action looks the way it has been
described.

Refer back to the worksheet in
Step 3 (Implementation Measures)
to review your decisions on how to
measure program implementation.

Check How To Measure Program Implementation
for suggestions about collecting information
to describe the program.








Step 9 



Run the Program One Cycle 



Instructions 



Let the program run as naturally as possible, but
check that accurate records are kept of the
students' exposure to the program.



Be careful. If teachers or the 
evaluator pay extra attention to 
experimental group students, this 
alone could cause superior learning 
So be as unobtrusive as possible. 




from th 



Step 10 



Posttest the 
E-group and C-group 



Instructions 



Give the posttest to the experimental and control
groups at one sitting, if possible, so that
testing conditions are the same for all students.
If one sitting is not possible, test half the
experimental group along with half the control
group at one sitting and the others at a second
sitting.

Of course, some of your outcome measures might not
be tests as such. Interviews, observations, or
whatever, should also be obtained from the
experimental and control groups under conditions
that are as similar as possible.

If necessary, schedule make-up tests for students
absent from the posttest.






Step 11 



Analyze the Results

Instructions

Score the Posttests

If the test you have constructed yourself
contains closed-response items (for example,
multiple choice or true-false), then you can
delegate someone to score the tests for you.

How To Measure Achievement, pages 117 to 120,
contains suggestions for scoring and recording
results from your own tests.

Check the Data Set and Prune as Necessary

Use the Sample List to complete this procedure:

Check for absences from the program. If some
students in either the experimental or control
group missed a lot of school during the program's
experimental cycle, they should be dropped from
the sample. You and your audience will have to
agree about how many absences will require
dropping the student from the analysis. One day's
absence in a cycle of one week would probably be
significant since it represents 20% of program
time. A week's absence in a six-month program, on
the other hand, could probably be ignored.

If you decide that students in the experimental
group should be dropped from the analysis if
their absences exceeded, say, six days during
the program, then control group students
absent more than six days should also be dropped.
This keeps the two groups comparable in
composition. If the control group received a
program representing a critical competitor to
the program in question, then control group
absences should be noted as well and the
Sample List pruned accordingly.

From attendance records, determine the number of
days each student was absent during the program
cycle. Record this information in appropriately
labeled columns added to the Sample List. Drop all
students whose absences exceeded a tolerable
amount for inclusion in the experiment. For every
student dropped, the corresponding control group
match will have to be dropped also. Drop as well
any student for whom there is no posttest score.
Drop his match also.

Summarize Attrition

Summarize results from pruning of the data in
the table below. The number dropped from each
group is called its "mortality" or "attrition."

TABLE OF ATTRITION DATA
Number of Students Remaining in the Study
After Attrition for Various Reasons

                                      Experimental   Control
                                      Group          Group

Number assigned on basis of
pretest                               ____           ____

Number dropped because of
excessive absence from school
during program                        ____           ____

Number dropped from E-group
because of failure to receive
program although in school            ____           ____

Number dropped because of lack
of posttest score                     ____           ____

Number dropped because match
was dropped                           ____           ____

Number retained for analysis          ____           ____

Record Posttest Scores on the Data Sheet for
Students Who Have Remained in the Analysis
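
The pruning rule (drop a student for excessive absence or a missing posttest score, and drop the match too) can be sketched as below. This is a minimal illustration, not part of the kit: the function name prune_pairs, the six-day cutoff, and all names and scores in the usage lines are hypothetical.

```python
def prune_pairs(pairs, days_absent, posttest, max_absences):
    """Keep only the matched pairs in which both members stayed in the study.

    pairs: list of (E-group student, C-group student) tuples.
    days_absent: dict of student -> days absent (missing means 0).
    posttest: dict of student -> posttest score (missing or None means no score).
    max_absences: largest number of absences you and your audience tolerate.
    """
    def keep(student):
        attended = days_absent.get(student, 0) <= max_absences
        tested = posttest.get(student) is not None
        return attended and tested

    # If either member fails the check, the whole pair is dropped.
    return [pair for pair in pairs if keep(pair[0]) and keep(pair[1])]

# Hypothetical records for three matched pairs.
pairs = [("Bellows", "Eaton"), ("Dayton", "Adams"), ("James", "Payne")]
days_absent = {"Adams": 7, "Payne": 1}
posttest = {"Bellows": 41, "Eaton": 37, "Dayton": 40,
            "Adams": 36, "James": 39, "Payne": 35}

kept = prune_pairs(pairs, days_absent, posttest, max_absences=6)
print("Pairs retained for analysis:", kept)
print("Pairs lost to attrition:", len(pairs) - len(kept))
```

In this made-up case Adams was absent seven days, so both Adams and the match, Dayton, leave the analysis.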






Instructions 




Test To See if the Difference in Posttest
Scores Is Significant

Were you to record just any two sets of posttest
scores, it is likely that one of the groups would
have higher scores than the other just by chance.
What you now need to ask is whether the difference
you will almost inevitably find between the E- and
C-group posttest scores is so slight that it could
have occurred by chance alone.

The logic underlying tests of statistical
significance is described in How To Calculate
Statistics. In fact, pages 71-76 of that book
discuss the t-test for matched groups, to be used
here, in detail.

To decide whether one or the other has scored
significantly higher in this situation, you
will use a correlated t-test, correlated
because of the matched pairs used to form the
two groups. Using your data, you will calculate
a statistic, t. You will then compare this
obtained value of t with values in a table. If
your obtained value is bigger than the one in the
table, the tabled t-value, then you can reject
the idea that the results were just due to
chance. You will have a statistically
significant result. Below are the steps for
this procedure.



Steps for Calculating and Testing t

Calculate t

This is the formula for t:

$$ t = \frac{\bar{d}\,\sqrt{n}}{s_d} $$

In order to calculate it, you need to first
compute the three quantities in the formula:

$\bar{d}$ = average difference score

$\sqrt{n}$ = the square root of the number of
matched pairs

$s_d$ = the standard deviation of the difference
scores

Use the data sheet from Step 7 to help you
calculate quantities for the t equation.

DATA SHEET

   1         2          3         4        5          6                  7
E-group   Posttest   C-group   Posttest   $d$   $d - \bar{d}$   $(d - \bar{d})^2$

Page 126 shows a data sheet that has been computed.

To compute $\bar{d}$. First find the difference
between the scores on the posttests for each pair
of students. The difference, d, for a pair is the
quantity:

$$ d = (\text{posttest score of the E-group student}) - (\text{posttest score of the matched C-group student}) $$

Note that whenever a C-group student has scored
higher than an E-group student, the difference is
a negative number. Record these differences in
Column 5 of the Data Sheet.

Then add up the entries in Column 5 and divide
that sum by the number of pairs being used in the
analysis, n. This gives you the average
difference between the E-group and C-group. Call
it $\bar{d}$, read "d bar."

To compute $s_d$. Fill in the quantities for
Columns 6 and 7. For Column 6, subtract $\bar{d}$
from each value in Column 5, and record the
result. For Column 7, square each number in
Column 6. Add up the entries in Column 7 and
divide their sum by n - 1, the number that is one
less than the number of pairs. Take the square
root of your last answer and record this below
as $s_d$.

To compute $\sqrt{n}$. Take the square root of the
number of matched pairs, not the number of
students, which you are using in the analysis.
This is $\sqrt{n}$. Enter it here:
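
For readers who keep the Data Sheet on a computer, the hand calculation can be sketched in Python as follows. This is an illustrative sketch only: the function name correlated_t and the posttest scores in the usage lines are hypothetical. The function follows the columns of the Data Sheet: Column 5 holds the differences, Column 6 the deviations from the average difference, and Column 7 their squares.

```python
import math

def correlated_t(e_scores, c_scores):
    """Compute d-bar, s_d, and t for matched pairs listed in the same order."""
    if len(e_scores) != len(c_scores) or len(e_scores) < 2:
        raise ValueError("need the same number of E and C scores, at least 2 pairs")
    n = len(e_scores)                                     # number of matched pairs
    d = [e - c for e, c in zip(e_scores, c_scores)]       # Column 5
    d_bar = sum(d) / n                                    # average difference
    deviations = [x - d_bar for x in d]                   # Column 6
    squares = [x * x for x in deviations]                 # Column 7
    s_d = math.sqrt(sum(squares) / (n - 1))               # standard deviation of d
    if s_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = d_bar * math.sqrt(n) / s_d
    return d_bar, s_d, t

# Hypothetical posttest scores for six matched pairs.
e_group = [41, 37, 40, 33, 38, 30]
c_group = [37, 36, 35, 34, 33, 29]
d_bar, s_d, t = correlated_t(e_group, c_group)
print(f"d-bar = {d_bar:.2f}, s_d = {s_d:.2f}, obtained t = {t:.2f}")
```

Keeping the intermediate lists named after the Data Sheet columns makes it easy to check the computer's figures against a hand-computed sheet.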






Instructions 



To compute t. Now enter these values in the
formula for t below:

$$ t = \frac{\bar{d} \times \sqrt{n}}{s_d} $$

Multiply the top line. Then divide the result
by $s_d$ to get your t-value. Enter it here:

____ = obtained t-value



Find the Tabled t-value

Using the table below, go down the left-hand
column until you reach the number which is equal
to the number of matched pairs you were analyzing.
Be careful to use the number of pairs, not the
number of students.



Table of t-Values for Correlated Means

Number of         Tabled t-value for
matched pairs     a 10% probability
                  (one-tailed test)

   6              1.48
   7              1.44
   8              1.41
   9              1.40
  10              1.38
  11              1.37
  12              1.36
  13              1.36
  14              1.35
  15              1.34
  16              1.34
  17              1.33
  18              1.33
  19              1.32
  20              1.32
  21              1.32
  22              1.32
  23              1.32
  24              1.32
  25              1.31
  26              1.31
  40              1.30
 120              1.29

The t-value in the row that corresponds to the
number of matched pairs is your tabled t-value.
Enter it here:

____ = tabled t-value



Interpret the t-test 

If the obtained t-value is greater than the tabled
t-value, then you have shown that the program
significantly improved the scores of students who
got it. If your obtained t-value is less, then
there is more than a 10% chance that the results
were just due to chance. Such results are not
usually considered statistically significant. The
program has not been shown to make a statistically
significant difference on this test.

The test of statistical significance which you
have used here allows a 10% chance that you will
claim a significant difference when the results
were in fact only due to chance. If you want to
make a firmer claim, use the Table of t-values in
Appendix B. This table allows only a 5% chance of
making such an error.
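
The table lookup and comparison can be sketched as follows. The values are copied from the Table of t-Values for Correlated Means above; the function names are hypothetical, and the rule for pair counts the table does not list (use the nearest smaller listed number, which gives a slightly stricter cutoff) is an assumption, not something the guide specifies.

```python
# Tabled t-values for a 10% probability (one-tailed test), keyed by number of pairs.
TABLED_T_10_PERCENT = {
    6: 1.48, 7: 1.44, 8: 1.41, 9: 1.40, 10: 1.38, 11: 1.37, 12: 1.36,
    13: 1.36, 14: 1.35, 15: 1.34, 16: 1.34, 17: 1.33, 18: 1.33, 19: 1.32,
    20: 1.32, 21: 1.32, 22: 1.32, 23: 1.32, 24: 1.32, 25: 1.31, 26: 1.31,
    40: 1.30, 120: 1.29,
}

def tabled_t(n_pairs):
    """Return the tabled t-value for the given number of matched pairs."""
    listed = [n for n in TABLED_T_10_PERCENT if n <= n_pairs]
    if not listed:
        raise ValueError("the table starts at six matched pairs")
    # Assumption: for an unlisted count, fall back to the nearest smaller row.
    return TABLED_T_10_PERCENT[max(listed)]

def is_significant(obtained_t, n_pairs):
    """True when the obtained t-value is greater than the tabled t-value."""
    return obtained_t > tabled_t(n_pairs)

print(tabled_t(6), is_significant(2.44, 6))
print(tabled_t(30), is_significant(1.20, 30))
```

Note that the comparison is one-tailed: it only asks whether the E-group did better than the C-group, so a large negative t never counts as significant here.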

A good procedure in any case is to repeat the
program another cycle and again perform this
evaluation-by-experiment, only this time use the
5% table to test the results. If your results are
again significant, you will have very strong
grounds for asserting that the program makes a
statistically significant difference in results
on the outcome measure.



Construct a Graph of Scores

If results were statistically significant, display
them graphically. Figures A and B present two
appropriate ways to do this. Figure A requires
fewer calculations.

[Figure A. Posttest means of groups formed from
matched pairs. Vertical axis: mean score on
posttest. Horizontal axis: E-group, C-group.]






Instructions 




[Figure B. Pretest and posttest mean scores of
experimental and control groups. Horizontal axis:
Pretest, Posttest.]



You may wish to take a closer look at the results
than just examining averages or single tests of
significance. Taking a close look will further
help you interpret results. In particular, if
the results were not statistically significant,
you may want to look for general trends.

One good way to take a closer look at results is
to compute the gain score (posttest minus pretest)
for each student. Using gain scores, you can
plot two bar graphs, one showing gain scores in
the experimental group and the other showing gain
scores in the control group. If some students'
scores were quite extreme, look into these cases.
Perhaps there was some special condition, such as
illness or coaching, which explains extreme
scores. If so, these students' scores should be
dropped and the t-test for differences in posttest
scores computed again.
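
Gain scores and a rough bar graph for one group can be produced with a short script like the one below. It is a sketch only: the function names and all scores are hypothetical, and the text bars are a stand-in for a hand-drawn or plotted graph.

```python
def gain_scores(pretest, posttest):
    """Posttest minus pretest for every student who has both scores."""
    return {name: posttest[name] - pretest[name]
            for name in pretest if name in posttest}

def print_gain_bars(title, gains):
    """Print one text bar per student, largest gain first."""
    print(title)
    for name, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
        bar = ("#" if gain >= 0 else "-") * abs(gain)   # '-' marks a loss
        print(f"  {name:10s} {gain:+3d} {bar}")

# Hypothetical scores for an experimental group.
e_pretest = {"Bellows": 38, "Dayton": 35, "James": 35}
e_posttest = {"Bellows": 41, "Dayton": 47, "James": 33}

e_gains = gain_scores(e_pretest, e_posttest)
print_gain_bars("E-group gain scores", e_gains)
```

Running the same two functions on the control group's scores gives the second graph; an unusually long bar in either one is a case worth looking into.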



Step 12

If Your Task Is Formative,
Meet With the Staff
To Discuss Results



Instructions 



The agenda for this meeting should have an outline 
something like the following: 



Introduction 

Review the contents of the worksheet in Step 3,
pages 111 and 112.

Presentation of Results

Display and discuss the attrition table which
describes student absences from the experimental
and control groups. Display Figures A and B and
discuss them. Report the results of the test of
significance.



Discussion of the Results 

If the difference was significant as hypothesized
(the E-group did better than the C-group), you
will need to answer these questions:

• Was the result educationally significant? That
is, was the difference between the E-group and
the C-group large enough to be of educational
value?

• Were the results heavily influenced by a few
dramatic gains or losses?

• Were the gains worth the effort involved in
implementing the program?

If the results were non-significant, you will need
to consider:

• Do you think this was due to too short a time
span to give the program a fair chance to show
its effects, or was the program a poor one?

• Were there special problems which could be
remedied?

• Was the result nearly significant?

• Should the program be tried again, perhaps with
improvements?

Recommendations

On the basis of the results, what recommendations
can be made? Should the program be expanded?
Should another evaluation be conducted to get
firmer results, perhaps using more students? Can
the program be improved? Could the evaluation be
improved? Collect and discuss recommendations.




Step 13 



Write a Report if Necessary 



Instructions 



Use as resources the book How To
Present an Evaluation Report and the
worksheet in Step 3 of this guide.
The worksheet, you will remember,
contains an early draft of the sections of the
report that describe the program and the
evaluation.




You have reached the end of the Step-by-Step Guide
for Conducting a Small Experiment. The guide,
however, has two appendixes:

• Appendix A contains an example of an evaluation
report prepared using this guide.

• Appendix B contains a table of t-values for
testing results at the 5% level.





Appendix A 



Example of an Evaluation Report 



This example, which is fictitious and should not
be interpreted as evidence for or against any
particular counseling method, illustrates how an
experiment can form the nucleus of an evaluation.
Notice that information from the experiment does
not form the sole content of the report. The
evaluator has to consider many contextual,
program-specific pieces of information, such as
the exact nature of the program, the possible bias
that might be introduced into the data by the
information available to the respondents, etc.
There is no substitute for thoughtfulness and
common sense in interpreting an evaluation.



EVALUATION REPORT

Program: The Preventive Counseling Program

Program location: Naughton High School

Evaluator: J. P. Simon, Principal,
Naughton High School

Report submitted to: J. Ross, Director of
Evaluation, Mimieux School District

Period covered by report: January 6, 19xx-
February 16, 19xx

Date report submitted: March 31, 19xx



Section I. Summary 

A new counseling technique based on "reality
therapy" and the motto that "prevention is better
than cure" was developed by the Mimieux School
District and consultants.

Naughton High School evaluated this Preventive
Counseling Program by making it available to one
group of students, but not to a matched control
group.

Results of teacher ratings subsequent to the
Preventive Counseling Program and a count of the
number of referrals to the office both pointed to
the success of the PC program, at least on this
short-term basis.

This evaluation report details these findings and
presents a series of recommendations for further
evaluation of this promising program.



Section II. Background Information Concerning
the Preventive Counseling Program

A. Origin of the Program

Several counselors had received special training,
at district expense, in a style of counseling
related to "reality therapy." This counseling was
designed to be used with students whom teachers
felt were "heading for trouble" in school or not
adjusting well to school life. By an intensive
course of counseling, it was hoped to prevent
future problems, hence the title the Preventive
Counseling Program. The district office asked
Naughton High School to assess the effectiveness
of this kind of counseling. A counselor trained
in the technique was made available to the school
on a trial basis for four hours a day over a
two-week period.

B. Goal of the Program

The goal of the Preventive Counseling Program (PC)
was to promote successful adjustment to school
among students whom teachers referred to the
office.

C. Characteristics of the Program

In the PC program, a student who is referred by a
teacher receives an initial 20 minutes of
counseling. Follow-up counseling sessions are
given to the student each day for the next two
weeks.

This program differs from methods used previously
to handle referrals to the office. Previously,
teachers were not encouraged to refer students to
the office. When a student was referred for some







particular reason, he generally received one
counseling session and perhaps no follow-up at
all, unless the teacher referred the student
again. This kind of counseling was the
responsibility of the usual counseling staff or,
in exceptional cases, the vice-principal.

The PC program:

1. Uses counselors who are specially trained in
"reality therapy" counseling

2. Requests referrals before an incident
necessitates referral

3. Gives the student two weeks of counseling

D. Students Involved in the Program

The counseling is appropriate for students of all
grade levels. Any student referred by a teacher
is eligible for counseling. During the trial
period for this evaluation, however, only some
referred students could receive the PC program.

E. Faculty and Others Involved in the Program

As far as possible, the counselor and teachers
communicated directly regarding students in need
of counseling. A clerk handled scheduling of
counseling sessions, managing this in addition to
his other duties.



Section HI. Description of 
the Evaluation Study 

A. Purposes of the Evaluation 

The District Office wanted Naughton High School 
to evaluate the effectiveness of the new style of 
counseling. The study in this school was to be 
one of several st-J1es conducted to assist the 
District in deciding whether or not to have other 
counselors receive reality therapy training and 
conduct preventive counseling. 

Several School Board members had emphasized that 
they were Interested in seeing firm evidence, not 
opinions. 

B. Evaluation Design 

In view of the costly decisions to be made and the 
desire of the Board members for "hard data," the 
evaluation was designed to measure the results of 
the PC program as objectively and accurately as 
possible. To accomplish this, it was deemed
necessary to use a true control group. Teachers 
were asked to name students in their classes who 
were in need of counseling. For each student 
named, the teacher provided a rating of the student's
adjustment to school on a 5-point scale from "extremely
poor" to "needs a little improvement." This was called
the adjustment rating.

Students referred by three or more teachers formed
the sample used in the evaluation. An average 
adjustment rating was calculated for each of the 
sample students by adding together all ratings for 
a student and dividing by the number of ratings 
for that student. These students were then 
grouped by grade and sex. Matched pairs were 
formed by matching students (within a group) with
close to the same average ratings. 

From these matched pairs, students were randomly 
assigned to receive the new counseling (the 
Experimental or E-group) or to be the Control 
group or C-group. Should students from the control
group be referred for counseling because of
some incident, for example, then the regular
counselors were requested to counsel as they had 
in the past. The E-group students received the 
two weeks of counseling which is characteristic 
of the PC program.
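
The matching and random assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the evaluators actually ran: the record layout (name, grade, sex, ratings) and the function names are hypothetical, neighbours in sorted order are treated as "close to the same average rating," and a student left over in an odd-sized group is simply dropped from the sample.

```python
import random
from collections import defaultdict
from statistics import mean


def form_matched_pairs(students, min_referrals=3):
    """Pair students within each (grade, sex) group by closeness of average rating.

    `students` is a list of dicts with hypothetical keys:
    name, grade, sex, ratings (one rating per referring teacher).
    """
    groups = defaultdict(list)
    for s in students:
        # Only students referred by three or more teachers enter the sample.
        if len(s["ratings"]) >= min_referrals:
            groups[(s["grade"], s["sex"])].append((mean(s["ratings"]), s["name"]))

    pairs = []
    for members in groups.values():
        members.sort()  # neighbours in sorted order have similar averages
        # Pair neighbours; an odd student out is left unpaired (an assumption).
        for i in range(0, len(members) - 1, 2):
            pairs.append((members[i][1], members[i + 1][1]))
    return pairs


def assign_groups(pairs, rng=None):
    """Randomly send one member of each pair to the E-group, the other to the C-group."""
    rng = rng or random.Random()
    e_group, c_group = [], []
    for first, second in pairs:
        if rng.random() < 0.5:  # coin flip decides who receives the program
            first, second = second, first
        e_group.append(first)
        c_group.append(second)
    return e_group, c_group
```

Passing a seeded `random.Random` to `assign_groups` makes the assignment reproducible, which can help if the assignment ever has to be documented or audited.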

At the end of the two-week cycle, all referrals to 
the office were again dealt with by regular coun- 
selors or the vice-principal. Over the next four 
weeks, records of referrals to the office were 
kept. If the number of referrals to the office 
was significantly fewer for the students who had 
received the PC program (i.e., the E-group stu- 
dents), then the program would be inferred to have 
been successful. 

This measure is reasonably objective and the ran- 
dom assignment of students from matched pairs 
ensured the initial similarity of the two groups, 
thus making it possible to conclude that any 
difference in subsequent rates was due to the PC 
program. 

C. Outcome Measures 

As mentioned above, the effect of the program was 
measured by counting, from office records, how 
many times each control group student and how many 
times each experimental group student was referred 
to the office in the four weeks after the inter- 
vention program ended. 

An unavoidable problem was that teachers were 
sometimes aware of which students had been 
receiving the regular counseling, since students 
were called to the office regularly for two weeks 
from their classes. Teachers might have been 
influenced by this fact. In order to reduce the 
possible impact of this situation on teacher 
referral behavior, the fact that the evaluation 
was being conducted was not made known until after 
the data collection period was over (four weeks 
after the Preventive Counseling program ended).

A second measure of outcomes was also collected: 
teachers were asked at the end of the data 






Evaluator's Handbook



collection period to re-rate all students pre- 
viously identified as needing counseling, giving a 
"student adjustment rating" on the same 5-point 
scale which had been used in the beginning of the
program. 

D. Implementation Measures

The counselor's records provided the documentation 
for the program. Essentially, these records were 
used to verify that only E-group students had 
received the Preventive Counseling program and to 
record any absences which might require that the 
student not be counted in the evaluation results.



Section IV. Results 

A. Results of Implementation Measures 

Eighteen pairs of students were formed from teach- 
ers' referrals. The 18 students in the E-group 
had a perfect attendance record during the Preven- 
tive Counseling program and did not miss any 
counseling sessions. However, two students in the
control group were absent for a week. These students
and their matched pairs were not counted in
the analysis, thus leaving a total of 16 matched
pairs.

B. Results of Outcome Measures 



Table 1 shows the number of referrals to the 
office from the experimental and control groups 
during each of the four weeks following the end of 
the PC program. 

TABLE 1

Number of Referrals to the Office

                                  # of referrals to office
                              Week 1  Week 2  Week 3  Week 4  Total
E-group (had received PC)        1       1       1       2       5
C-group (had not received PC)    3       2       3       2      10



There were twice as many referrals (10 as opposed
to 5) in the control group as in the experimental
group. Closer analysis revealed that four of the
referrals in the E-group were produced by one student
who was referred to the office each week.
Checking the number of students referred at least 
once (as opposed to the total number of referrals), 
it was found that there were two for the experi- 
mental and six for the control group. 

The second set of averaged school adjustment 
ratings collected from teachers is recorded in
Figure 1, and the calculations for a test of the
significance of the results are presented in the
same figure. The t-test for correlated means was
used to examine the hypothesis that the E-group's
average adjustment ratings would be higher, after
the program, than those of the C-group. The
hypothesis could be accepted with only a 10%
chance that the obtained difference was simply the
result of chance sampling fluctuations. The
obtained t-value was 2.06, and the tabled t-value
(.10 level) was 1.34.



DATA SHEET

      E-group                  C-group
Student  Final average    Student  Final average      d    d - d̄   (d - d̄)²
         adjustment                adjustment
         rating                    rating

AK           3            WK           1              2     1.38     1.90
GF           2            LJ           2              0    -0.62     0.38
ST           4            CF           1              3     2.38     5.66
CT           4            LM           3              1     0.38     0.14
JB           3            MH           3              0    -0.62     0.38
SK           3            FH           4             -1    -1.62     2.62
UL           5            DH           5              0    -0.62     0.38
flQ          5            RR           4              1     0.38     0.14
JJ           3            XT           1              2     1.38     1.90
WV           2            KN           2              0    -0.62     0.38
AC           4            JR           3              1     0.38     0.14
CK           3            OF           4             -1    -1.62     2.62
CR           2            PD           1              1     0.38     0.14
RA           5            NW           5              0    -0.62     0.38
PG           3            JM           4             -1    -1.62     2.62
FW           4            RL           2              2     1.38     1.90

n = 16                                     Sum of d = 13 - 3 = 10    Sum of (d - d̄)² = 21.68

$$\bar{d} = \frac{10}{16} = 0.62, \qquad s_d = 1.20$$

$$t = \frac{\bar{d}\,\sqrt{n}}{s_d} = \frac{(0.62)(4)}{1.20} = \frac{2.48}{1.20} = 2.06$$



Figure 1 
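
The hand calculation in Figure 1 can be checked with a short Python sketch. The ratings below are transcribed from the data sheet; the function name `paired_t` is only an illustrative choice, and the standard deviation is assumed to use n - 1 in the denominator, which is consistent with the tabled value of 1.20. Because the sketch carries full precision rather than rounding the mean difference to 0.62 and the standard deviation to 1.20, it yields a t of roughly 2.08 rather than the 2.06 reported, and a sum of squared deviations of 21.75 rather than 21.68. Either value exceeds the tabled t of 1.34.

```python
import math

# Final average adjustment ratings, in matched-pair order, from Figure 1.
e_ratings = [3, 2, 4, 4, 3, 3, 5, 5, 3, 2, 4, 3, 2, 5, 3, 4]
c_ratings = [1, 2, 1, 3, 3, 4, 5, 4, 1, 2, 3, 4, 1, 5, 4, 2]


def paired_t(e, c):
    """t-test for correlated means on matched pairs (no rounding of intermediates)."""
    n = len(e)
    d = [x - y for x, y in zip(e, c)]            # difference within each pair
    d_bar = sum(d) / n                           # mean difference
    ss = sum((di - d_bar) ** 2 for di in d)      # sum of squared deviations
    s_d = math.sqrt(ss / (n - 1))                # standard deviation of differences
    t = d_bar * math.sqrt(n) / s_d
    return d_bar, ss, s_d, t


d_bar, ss, s_d, t = paired_t(e_ratings, c_ratings)
print(f"mean d = {d_bar:.3f}, s_d = {s_d:.2f}, t = {t:.2f}")
```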






C. Informal Results

Several teachers commented informally about the 
counseling that their problem students were 
receiving. One said the counseling seemed to be
less "touchy feely" and more "getting down to
specifics," and she noted an increase in task
orientation in a counselee in her room beginning
at about the second week of special counseling.
She felt, however, that the counseling should
have continued longer. Other teachers did not
seem to have ascertained the style of counseling
being used, but commented that counseling seemed
to be having less transitory effect than usual.

A parent of one of the counselees in the PC program
called the principal to praise the consistent
help his child was getting from the special
counselor. "I think this might turn him around,"
the parent said.

Negative comments came from one teacher who com- 
plained that one of her students always seemed to 
miss some important activity by being summoned to 
the counseling sessions. Another teacher, how- 
ever, commented that it was a relief to have the 
counselee gone for a little while each day. 



Section V. Discussion of Results 

The use of a true experimental design enables the 
results reported above to be interpreted with 
some confidence. Initially, the E-group and C- 
group were composed of very similar students 
because of the procedure of matching and random 
assignment. The E-group received preventive 
counseling whereas the C-group did not. In the 
four weeks following the program, all students 
were in their regular programs and during this 
time, students from the C-group received twice 
as many referrals to the office as students from 
the E-group. 

In interpreting this measure, it should be remem- 
bered that referral to the office is a quite 
objective behavioral measure of the effect of the 
program. It appears that the Preventive Coun- 
seling program substantially reduced the number 
of referrals to the office over this four week 
period. Whether this difference will continue is 
not known at this time. 

The average post-counseling ratings which teach- 
ers assigned to students in the E-group and in 
the C-group showed a significant difference in 
favor of the E-group. A problem in interpreting 
this result is that the teachers were aware of 
which students had been in the counseling program
and this might have affected their ratings. How- 
ever, 52 teachers were involved in these ratings, 
some rating only one student and others rating 
more. That the result was in the same direction 
as the behavioral measure lends both measures 
additional credibility. 



Section VI. Cost-Benefit Considerations 

The program appears to have an initially beneficial
effect. However, it also is a fairly expensive
program. There are two main expenses
involved: the cost of training counselors in 
reality therapy and the cost of providing the 
counseling time in the school. There was no way 
in this evaluation of determining if the training 
had an important influence on the program's 
effectiveness. It could have been that other 
program characteristics— its preventive approach 
or the continuous daily counseling— were the 
influential characteristics. Training in reality
therapy could possibly be dispensed with thus 
saving some of the expense. However, since 
training can presumably have lasting effects on a 
counselor, its cost over the long-run is not great 
and comes nowhere near approaching the cost of the
provision of counseling time each day. 

It is understood that a cost-benefit analysis will 
be conducted by the District office using results 
from several schools. One question needing con- 
sideration is whether the Preventive Counseling 
program will in fact save personnel time in the 
long run by catching minor problems before they 
develop into major problems. To answer such a 
question requires the collection of data over a 
longer time period than the few weeks employed in
this evaluation. If the program helps students to
overcome classroom problems, then its benefits,
although perhaps immeasurable, might be great.



Section VII. Conclusions and Recommendations
A. Conclusions 



In this small scale experiment, the Preventive 
Counseling program appeared to be superior to 
normal practice. It produced better adjustment 
to school, as rated by teachers, and resulted in 
fewer teacher referrals to the office in the four 
weeks following the end of the two week PC pro- 
gram. It was not possible to determine, from this 
small study, the extent to which each of the pro- 
gram's main characteristics was important to the 
success of the overall program. 

B. Recommendations Regarding the Program 



The Preventive Counseling program is promising 
and should be continued for further evaluation. 

Preventive Counseling without the reality 
therapy training might be instituted on a 
trial basis. 



C. Recommendations Regarding Subsequent Evalua- 
tion of the Program 



1. The kind of evaluation reported here, an eval- 
uation based on a true experiment and fairly 
objective measures, should be repeated several 






times to check the reliability of the effects 
of counseling as so measured. 

2. In several evaluations of the Preventive Coun-
seling program, the outcome data should be 
collected over a period of several months to 
assess long-term effects. 

3. The School Board and the schools should be
provided with a cost analysis of the coun- 
seling program which includes a clear indica- 
tion of (a) the alternative uses to which the 
money might be put were it not spent on the 
PC program, and (b) the cost of other means 
of assisting students referred by teachers. 

4. An evaluation should be designed to measure 
the relative effectiveness of the following 
four programs: 

• The Preventive Counseling program 

• The Preventive Counseling program run with- 
out reality therapy training 

• Reality therapy provided to regular coun- 
selors 

• The usual means of handling referrals








HOW TO DEFINE YOUR ROLE AS AN EVALUATOR 
(DRAFT) 

Revision Author: Brian Stecher
December 6, 1985 






How To Focus An Evaluation 



Brian Stecher 
Outline 

Introduction 

A. Purposes of the book. 

B. What it will and will not tell you. 

C. Chapter by chapter overview. 

Chapter One: Presenting a model for focusing an evaluation. 

A. Preliminary comments/caveats. 

1. Limitations of any model of complex human interactions. 

2. Value as a tool for learning and instruction. 

B. What are the elements of the focusing process? 

1. Acknowledging existing beliefs and expectations. 

a. Evaluator has beliefs about the meaning of evaluation, 
embodied in a particular approach . 

b. Client has expectations for the evaluation, based upon 
needs and wants. 

2. Gathering information. 

a. Evaluator seeks information about many topics. 

(1) Client's needs and expectations. 

(2) Program goals and activities. 

(3) Other concerned individuals or groups. 

(4) Constraints and limitations, etc. 






b. Client seeks information about many topics.

(1) Evaluator's capabilities. 

(2) Value and limitations of evaluation. 

(3) Evaluation procedures, etc. 

3. Narrowing the focus and formulating a tentative strategy.

a. Establishing priorities. 

b. Formulating preliminary plans. 

c. Melding the evaluator's approach and the client's
expectations.

4. Negotiating an evaluation plan. 

a. Specifying evaluation questions.

b. Clarifying procedures. 

Chapter Two: Thinking about client concerns and evaluator approaches. 

A. Client needs and expectations. 

1. Why consult an evaluator? 

a. Legal mandates. 

b. Stated program goals and objectives. 

c. Specific questions. 

d. General concerns or problems. 

2. Client conception of evaluation. 

B. There are different approaches to evaluation. 

1. What we mean by an "evaluation approach." 

2. Derivation of these points of view. 

C. The research approach. 

1. Conception of the meaning and purposes of evaluation.

2. Methods for accomplishing these purposes. 






D. The goal-oriented approach.

1. Conception of the meaning and purposes of evaluation. 

2. Methods for accomplishing these purposes.

E. The decision-focused approach. 

1. Conception of the meaning and purposes of evaluation. 

2. Methods for accomplishing these purposes. 

F. The user-oriented approach.

1. Conception of the meaning and purposes of evaluation.

2. Methods for accomplishing these purposes.

G. The responsive approach. 

1. Conception of the meaning and purposes of evaluation. 

2. Methods for accomplishing these purposes. 

H. Comparison of approaches. 

1. Similarities: information, validity, usefulness, etc. 

2. Differences: research paradigm, degree of objectivity, role
of the evaluator, etc.

Chapter Three: How to gather information.

A. Introduction: This is a simplified discussion of a complex, 
interactive procedure. 

1. It is a dynamic process that differs in each case.

2. As the expert the evaluator is likely to have strong 
influence. 

3. There are fundamental concerns common to all evaluators that can be
captured in four or five basic questions.






B. "What is the program all about?" 

1. Obtaining information about the program. 

2. How different evaluators might ask this question. 

a. What variables do you want to study? 

b. What are your goals and objectives?

c. What decisions are going to be made? 

d. Who is likely to use the information? 

e. Who is affected by the program? 

3. How different clients might respond.

C. "What do you want to know about the program?"

1. Illustrations of various points of view. 

2. Questions each evaluator will want to have answered. 

D. "Who else is concerned and may need to be involved?" 

1. Extending the information base. 

2. How different evaluators would address this issue.

E. "Why do you want this information?" 

1. Client's view of purposes for the evaluation.

2. Impact on different evaluators. 

F. "What constraints or limitations are there?" 

1. What practical limits exist: money, time, access, etc.?

2. What contextual constraints exist: attitudes, politics,
beliefs, etc.?

3. How would different evaluators address these issues? 

Chapter Four: How to narrow the focus and develop tentative plans. 

1. Developing and revising plans as information is gathered.

2. Establishing priorities.






3. Adapting strategies to fit particular situations. 

4. Balancing evaluator's point of view and client's wishes. 



Chapter Five: How to negotiate an evaluation plan. 

1. Desired outcomes. 

a. Specific objectives and evaluation questions. 

b. Methods and procedures. 

2. Options for the evaluator. 

a. Reach a collaborative agreement. 

b. Decline to conduct the evaluation. 



Closing Comments 

1. Summary 

2. Return to the Kit. 






HOW TO DESIGN A PROGRAM EVALUATION 
(DRAFT) 

Revision Author: Joan Herman 
December 6, 1985 






Chapter 1 

An Introduction to Evaluation Design 



A design is a plan which dictates when and from whom information is to
be collected during the course of an evaluation. The first and obvious
reason for using a design is to ensure a well organized evaluation study:
all the right people will take part in the evaluation at the right times.
A design, however, accomplishes for the evaluator something more useful
than just keeping data collection on schedule. A design is most basically
a way of gathering data so that the results will provide sound, credible
answers to the questions the evaluation is to address.

The term design traditionally has been used in the context of
quantitative evaluation studies where judgments of program worth or of
relative effectiveness are a primary consideration. This book is addressed
to designs for these types of studies. It is important to note, however,
that there are occasions when a quantitative study does not represent the
best approach to answering important evaluation questions, where
qualitative approaches may be more appropriate. Design is equally
important in assuring the quality of information derived from qualitative
studies. The reader is referred to How to Conduct Qualitative Studies for
a discussion of important design issues in these latter types of studies.

What is the purpose of design in quantitative studies? A design is a
plan for gathering comparative information so that results from the program
being evaluated can be placed within a context for judgment of their size
and worth. Designs reinforce conclusions the evaluator can draw about the
impact of a program by helping the evaluator to predict how things might
have been had the program not occurred or if some other program had
occurred instead.




Chapter 1



An Introduction 
to Evaluation Design 



A design is a plan which dictates when and from whom measurements will
be gathered during the course of an evaluation. The first and obvious
reason for using a design is to ensure a well organized evaluation study: all
the right people will take part in the evaluation at the right times. A
design, however, accomplishes for the evaluator something more useful
than just keeping data collection on schedule. A design is most basically a
way of gathering comparative information so that results from the program
being evaluated can be placed within a context for judgment of their
size and worth. Designs reinforce conclusions the evaluator can draw about
the impact of a program by helping the evaluator to predict how things
might have been had the program not occurred or if some other program
had occurred instead. The comparative data collected could include how
the school environment might have looked, how people might have felt,
and how participants might have performed had they not encountered the
particular program under scrutiny. Usually a design accomplishes this by
prescribing that measurement instruments (tests, questionnaires, observations)
be administered to comparison groups not receiving the program.
These results are then compared with those produced by program participants.
At other times, predictions about what would have happened in the
program's absence can be produced without a comparison group through
application of statistical techniques.



1. Some writers have used the word model instead of design, probably because
the choice of such a measurement plan usually affects the evaluator's whole point of
view about the seriousness of the enterprise and about how information will be
gathered, analyzed, and presented. This book prefers design, the less ponderous term,
and it will be used throughout.








How to Design a Program Evaluation 



The objective of this book is to acquaint you with the ways in which 
evaluation results can be made more credible through careful choice of a 
design prescribing when and from whom you will gather data. The book 
helps you choose a design, put it into operation, and analyze and report
the data you have gathered. The book's intended message is that attention
to design is important. 

Even if choice or practicality dictate that you ignore the issue of design,
it is important that you understand the data interpretation options which
you have chosen to pass by. In the majority of evaluation situations, some 
comparative information is better than none. Your choice of a design will 
perhaps determine whether the information you produce is believed and 
used by your evaluation audience or shrugged off because its many 
alternative interpretations render it unworthy of serious attention. 

The book's contents are based on the experience of evaluators at the 
Center for the Study of Evaluation, University of California, Los Angeles, 
on advice from experts in the field of educational research, and on the 
comments of people in school settings who used a field test edition. The 
book focuses on those evaluation designs which seem most practical for 
use in program evaluation. Please be aware that these are not the only 
designs available for adoption as bases for useful research. They do seem to 
be, however, the most straightforward and intuitively understandable. This 
makes them likely to be accepted by the lay audiences who will receive 
and must interpret your evaluation findings. Please bear in mind, in 
addition, that many of the recommended procedures in this book pre- 
scribe the design of a program evaluation under the most advantageous 
circumstances. Few evaluation situations exactly match those envisioned 
here or described in the book's myriad examples. Therefore, you should
not expect to duplicate exactly suggestions in the book. Evaluation is a
relatively new field, and correct procedures, even where choice of a design is
concerned, are not firmly established. In fact, while considerable attention
has been given to the quality of measurement instruments for assessing 
cognitive and affective effects of programs, relatively little attention has 
been paid to the provision of useful designs. Your task as an evaluator is to 
find the design that provides the most credible information in the situation 
you have at hand and then to try to follow directions as faithfully as 
possible for its implementation. If you feel you'll have to deviate from the
procedures outlined here, then do. If you think the deviation will affect 
interpretation of your results, then include the appropriate qualifications 
in your report. 

If political pressures or the heat of controversy make it important that 
you produce credible information about program effects, few things will 
support you better than a well chosen evaluation design. Often evaluators 
discouraged by political or practical constraints have chosen to ignore 
design, perhaps cynically deciding that a good design represents information
overkill in a situation where little attention will be paid to the data
anyway. The experience of evaluators who have chosen to use good design
has been to the contrary. The quality of information provided through use 
of design has often forced attention to program results. Without design, 
the information you present will in most cases be haunted by the possibility
of misinterpretation. Information from a well designed study is hard
to refute; and in situations where they might have been ignored or 
shrugged off because of many or ambiguous interpretations, conclusions 
from a good design cannot be easily ignored. 

The Program Evaluation Kit, of which this book is one component, 
is intended for use primarily by people who have been assigned the role of 
program evaluator. The job of program evaluator takes on one of two 
characters, and at times both, depending upon the tasks that have been 
assigned: 

1. You may have responsibility for producing a summary statement about
the effectiveness of the program. In this case, you probably will report 
to a funding agency, governmental office, or some other representative 
of the program's constituency. You may be expected to describe the 
program, to produce a statement concerning the program's achievement 
of announced goals, to note any unanticipated outcomes, and possibly 
to make comparisons with alternative programs. If these are the fea- 
tures of your job, you are a summative evaluator. 
2. Your evaluation task may characterize you as a helper and advisor to 
the program planners and developers or even as a planner yourself. You 
may then be called on to look out for potential problems, identify areas 
where the program needs improvement, describe and monitor program 
activities, and periodically test for progress in achievement or attitude 
change. In this situation, you are a "jack of all trades," a person whose 
overall task is not well defined. You may or may not be required to 
produce a report at the end of your activities. If this more loosely 
defined job role seems closer to yours, then you are a formative
evaluator. 

The information about design contained in this book will be useful for 
both the formative and summative evaluator, although the perspective of 
each will vary. 



Designs in Summative Evaluation

Typically, design has been associated with summative evaluation. After all,
the summative evaluator is supposed to produce a public statement
summarizing the program's accomplishments. Since this report could affect
important decisions about the program's future, the summative evaluator
needs to be able to back up his findings. He therefore has to anticipate the
arguments of skeptics or even the outright attacks of opponents to the
conclusions he presents. While good design won't immunize him against
attack, it will strengthen his defense. Historically, designs were developed
as methods for conducting scientific experiments, methods through which
one can logically rule out the effect on outcomes of anything other than
the treatment provided. In the case of educational evaluation, this
treatment is an educational program. Since designs serve the interest of
producing defensible results, and since such production is primarily the
interest of the summative evaluator, you will find throughout the book a
strong summative flavor in both the procedures outlined and the examples
described.

To readers who are working right now as evaluators, the suggestion that 
design is of critical importance for summative evaluation may seem a little 
off-base. "No one uses experimental designs" you might say. "No one 
uses control groups." And you would be nearly correct, unfortunately, at
least with regard to large Federal and State funded programs. Not long ago
a study of a nationwide sample of ESEA Title VII (Bilingual Education) 
evaluations revealed that no one attempted to use a true, randomized 
control group, and only 36% tried to locate a non-randomized control 
group for comparison with any aspect of the programs evaluated. 2 In 
another study, a search of 2,000 projects that had received recognition as 
successful located not one with an evaluation that provided acceptable 
evidence regarding project success or failure. 3 

The reasons for this state of affairs are no doubt legion, but four come 
up frequently: 

1. Funders seem to view programs as one-shot enterprises. Once a program
has been implemented and has run its course, it becomes a fait accom- 
pli. It's over. Summative reports, then, describe something that has 
already happened. They are seldom seen as a chance to describe 
programs and their effects in the interest of future planning. In order to 
testify that a program took place at all, a summative report need not 
use a design. Designs become valuable only when someone hopes to use 
information about program processes and effects as a basis for future 
decisions such as whether to pay for similar programs or to expand the 
current one. Designs are essential when someone has in mind the 
development of theories about what instructional, management, or 
administrative strategies work best. 



2. Alkin, M. C., Kosecoff, J., Fitz-Gibbon, C., & Seligman, R. Evaluation and
decision making: The Title VII experience. CSE Monograph Series in Evaluation, No.
4. Los Angeles: Center for the Study of Evaluation, 1974.

3. Foat, C. M. Selecting exemplary compensatory education projects for dissem-








2. Evaluators are called in too late. This problem is actually a common
symptom of the first. Evaluation often occurs as an afterthought. Lack
of careful planning in the establishment of the program removes the
possibility of a carefully planned evaluation. The evaluator finds that he
has no control over the assignment of students or the sites chosen for
implementation of the program. The evaluator has to "evaluate" an
already on-going program. While this situation does not eliminate the
possibility of obtaining good comparative information, it usually makes use
of the best designs impossible.

Insert:

and/or political concerns, it is often difficult to accomplish the most
rigorous designs. Social programs often are aimed at individuals or groups
in great need, and withholding potential program benefits from some for the
sake of a comparative research design can be hard to justify. In addition,
it is frequently the case that politics rather than social science
methodology determines where or for whom special programs will be
implemented, precluding opportunities for randomized designs.

3. Social science research in general is still in its youth. Lack of research
design in evaluation stems partly from its relative novelty as a method
for gathering social science information at all. Sir Ronald Fisher's work
in statistics and design, an essential methodological step forward for the
social sciences, was completed in the 1930's! Not very long ago.

4. Educational researchers and evaluators themselves cannot agree about
the appropriateness of research designs for evaluation. While most
writers in the field of evaluation concur that at least a part of the
evaluator's role is to collect information about a program, the nature of
the rules governing data collection is still debated. Opponents of the
use of design usually list as major drawbacks the political and practical
constraints discussed here already, and the technical difficulties
involved with using the findings from one multifaceted program to
predict the outcomes of others.

Defenders of design, the authors of this book among them, acknowledge
these problems. They continue to urge the use of design in field
settings because designs yield the comparative information necessary
for establishing a perspective from which to judge program accomplishments.
In fields of endeavor such as education, where clear absolute
standards of performance have not been set, comparison is a way to
subject programs to scrutiny in order eventually to determine their
value.

Summative Evaluation and Educational

Summative evaluations should whenever possible employ experimental
designs when examining programs that are to be judged by their results.
The very best summative evaluation has all the characteristics of the best
research study. It uses highly valid and reliable instruments, and it
faithfully applies a powerful evaluation design. Evaluations of this caliber could
be published and disseminated to both the lay and research community.
Few evaluations of course will live up to such rigid standards or need to.
The critical characteristic of any one evaluation study is that it provide the
best possible information that could have been collected under the
circumstances, and that this information meet the credibility requirements of its

Insert:

Nonetheless, some of these impediments to good design are more intractable
than others. Suggestions

aiie offered at the 
end of this 
chapter for opti- 
mizing those 
■situations where 
there are signifi- 
cant intractable 
constraints . 






How to Design a Program Evaluation 



evaluation audience. The best interpretation of your task as summative 
evaluator is that you must collect the most believable information you
can, anticipating at all times how a skeptic would view your report.
Keeping this skeptic in mind, set about designing the evaluation which has
potential for answering the largest number of criticisms.

The aim of the researcher is to provide findings about a program which
can be generalized to other contexts beyond it. Criteria for what consti-
tutes generalizable information have been agreed upon by the social
science community; they are the topic of educational research texts.
Though it is important as a service to education that the evaluator provide
such information if the situation allows good design and high quality
instrumentation, the evaluator can usually limit his projection of the
quality of data he must collect to what he perceives will be acceptable to
his unique audience. It is not beyond the scope of the evaluator's job,
however, to educate his audience about what constitutes good and poor
evidence of program success and to admonish them about the foolishness
of basing important decisions on a single study, or even a few. It is equally
within the summative evaluator's task to advocate, based on his informa- 
tion, changes in a program or in funding policy or to express opinions 
about the program's quality. The evaluator who takes a stand, however, 
must realize that he will need to defend his conclusions, and this again 
means good data and a well designed study. 



Designs in Formative Evaluation

All this discussion about design in summative evaluation should not
persuade the evaluator that design is irrelevant in the formative case. The
use of design during a program's formative period gives the evaluator, and
through her the program staff, a chance to take a good hard look at the
effectiveness of the program or of selected subcomponents. This enables
the formative evaluator to fulfill one of her major functions: to persuade
the staff to constantly scrutinize and rethink assumptions and activities
that underlie the program. Careful attention to design can also help the
formative evaluator to conduct small-scale pilot studies and experiments
with newly-developed program components. These will inform decisions
among alternative courses of action and settle controversies about more or
less effective ways to install the program.

The message to the formative evaluator is this: Including a source of
comparative information (a control group or data from time series
measures) in any information-gathering effort makes that information more
interpretable. Too often formative measurement happens in a vacuum; no
one can judge whether students are making fast enough progress, for
instance, because no one can answer the question "Compared to what?"






Example 1. Franklin Elementary School has designed a pull-out pro- 
gram in reading for slow readers and wishes to assess the quality of the 
progress of the students during the program's first year of operation.
The hope is that students in the pull-out program will make faster
progress because of increased attention. The problem is knowing what
pace to expect from slow students. The vice-principal, serving as pro-
gram evaluator, has located a school in the same district which uses the
same programmed readers which form the backbone of the pull-out
program, the O'Leary Series. The evaluator has persuaded the principal
of the other school to allow her periodically to test their slow readers
for comparison with Franklin's. The evaluator has constructed a test
using sample sentences from the O'Leary series which will be admin-
istered for oral reading by both the pull-out program students and the
students in the other school. Since it is the first year of the pull-out
program, information gained from comparing the two schools will be
used formatively. If Franklin's readers are not progressing faster than
the controls, then this might signal a need for modification in the 
pull-out program. The design here is Design 5, the Time Series with 
Non-Equivalent Control Group, described in Chapter 5. 



Example 2. Osirus High School designed a six-week career awareness 
module for tenth grade based on field trips in which all students spend 
one afternoon a week at the work places of professionals pursuing 
careers in which the students are interested. The students conduct 
interviews and write short biographies describing each professional's 
route to success. Due partly to the extreme cost of such a large-scale 
field program, the school's director of vocational education decided to 
do some formative evaluation, assigning students randomly to the first 
six-week program tryout. This provided a Design 2 evaluation (Chapter
4), since no pretest was given. At the end of the six-week module, an
achievement test revealed that students had acquired large amounts of 
information about the careers of their choice, and were able to write 
essays which the career education staff judged to be realistic appraisals 
of the economic and social accompaniments to these careers. Students 
also seemed to have acquired a good sense of the steps necessary to 
attain an education toward the career of interest. A look at the control
group, however, showed that students who had not taken part in the 
career education program had acquired the same information and the 
same set of realistic expectations simply through talking to students 
who were taking part in the field test. It seemed that it might not be
necessary for every student to go into the field every week; at least this
didn't seem critical for making cognitive gains. 
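
The random assignment and posttest comparison used in an example like this one can be sketched in a few lines of Python. This is only a minimal illustration, not a procedure taken from the kit: the function names `assign_randomly` and `posttest_difference`, the even split into two groups, and the use of a simple difference in mean scores are all assumptions made for the sketch.

```python
import random
from statistics import mean


def assign_randomly(student_ids, seed=None):
    """Shuffle the students and split them into a program group and a control group.

    Hypothetical helper: an even split is assumed; a real tryout may
    admit only as many students as the program can serve.
    """
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def posttest_difference(scores, program, control):
    """Return mean posttest score of the program group minus that of the controls.

    `scores` maps each student id to a posttest score. No pretest is
    used, as in a posttest-only design with random assignment.
    """
    program_mean = mean(scores[s] for s in program)
    control_mean = mean(scores[s] for s in control)
    return program_mean - control_mean
```

A difference near zero, as in the Osirus example, suggests that the controls did about as well as the participants. How large a difference is worth acting on remains a judgment for the evaluator and the staff.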



For formative evaluation, it is a good idea at the outset to locate or 
assemble a control group, as described in Designs 1, 2, and 3, or to collect 
time series measures before the program begins (Designs 4 and 5). Laying 
down even the rudiments of design will give you a chance to make 
comparisons in order to interpret your findings or to justify your forma- 
tive recommendations if you should need to. 








Because of the flexibility of your job, you can try using designs for
formative evaluation in several ways, according to your own discretion: 

1. You might set up as "controls" various alternative versions of the
program you are helping to form. You may be able to identify alterna-
tive versions that the program can take, possibly one or more less costly
or time-consuming than the others. You could set up two or more
versions in different schools or classrooms, some receiving the more
expensive or more lengthy alternative. These alternatives could vary in
the amount they differ from the basic program, as well as in their
duration. They could last a short time, say, until someone has deter-
mined their relative quality; or they could span the duration of the
whole program, providing you at the end with an assessment of their,
and its, overall effectiveness. If whole schools or classrooms received
the alternative version, your evaluation would comprise a Design 3
study (Chapter 4), an evaluation with a non-equivalent control group. If
you have programs going on in several different classrooms to which
you can randomly assign students, you can implement a true control
group design. Often the "control group" tends to be thought of as a set
of losers, people who unluckily miss out on all the good benefits of the 
program. In a design which sets two competing versions of the program 
operating at the same time and where each of them is equally viable and 
potentially effective, exactly which group is the experimental group 
and which the control is really not worth considering. 



Example 1. A junior high school language arts teacher is in the process
of designing a writing curriculum for seventh and eighth grades. Rea-
lizing that motivation is a strong determiner of junior high performance
in any topic, the teacher has come up with four ways to motivate
students to write; but of course he doesn't know which will work best,
or if some will work better with some students than others. To give him
this needed information for future planning, he has decided to perform
a formative evaluation using one of the different strategies in each of
his four roughly comparable, heterogeneously-grouped classes: One
group will edit its own magazine; another will write articles to be
submitted to popular national magazines; a third group will write letters
to the editor of the local newspaper; and a fourth group will write a
play about the problems of adolescence. The teacher hopes to take
further advantage of this instance of Design 3, the Non-Equivalent
Control Group Design (Chapter 4), by analyzing results on periodic
writing exams separately for students whom he assessed to be good or
poor writers in the first place.

Example 2. A district-wide Early Childhood Education program has 
decided to incorporate a psycho-motor development component that 
will require installation of large playground equipment. In order to 
answer many questions about the best way to integrate the program 






into the overall early childhood curriculum, the district's Assistant
Superintendent for Early Childhood Programming has decided to install
the equipment in two-week phases to groups of randomly chosen
schools. The entire pool of the district's elementary schools will be
divided randomly into eight groups. The groups will receive the equip-
ment and begin the program at two-week intervals. Having eight groups
begin using the materials in this step-wise fashion will give the staff a
chance to do formative evaluation. After administering a pretest, they
will work with Group 1 for two weeks and then administer a psycho-
motor unit test. They will make necessary program modifications,
then initiate the program with Group 2. Group 2's pretest results,
because of randomization, should match Group 1's. Results of the unit
test with Group 2, however, can be compared with results from Group
1 to determine if program modifications have had an effect on student
development. This revision/program installation/test cycle can repeat as
many as six more times, or until the program seems to be yielding
maximal gains. This useful formative design is actually a version of
Design 1, the True Control Group Pretest-Posttest, detailed in Chap-
ter 4.
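
The step-wise installation in this example rests on one mechanical step: dividing the pool of schools randomly into eight groups that start two weeks apart. A minimal Python sketch of that step follows. The function name `staggered_groups`, the school names, and the week numbering that starts at 0 are assumptions made for illustration, not details from the example.

```python
import random


def staggered_groups(schools, n_groups=8, interval_weeks=2, seed=None):
    """Randomly divide schools into groups that begin the program in phases.

    Hypothetical helper: the pool is shuffled, then dealt out in turn
    so that group sizes differ by at most one school.
    """
    rng = random.Random(seed)
    pool = list(schools)
    rng.shuffle(pool)
    schedule = []
    for i in range(n_groups):
        schedule.append({
            "group": i + 1,
            "start_week": i * interval_weeks,  # Group 1 starts in week 0
            "schools": pool[i::n_groups],      # every n-th school of the shuffled pool
        })
    return schedule


# Usage: 24 made-up schools yield eight groups of three, starting in weeks 0, 2, ..., 14.
for entry in staggered_groups([f"School {n}" for n in range(1, 25)], seed=7):
    print(entry["group"], entry["start_week"], entry["schools"])
```

Because the groups are formed by chance, their pretest results should be roughly alike, which is what lets each later group serve as a fair test of the revisions made after the group before it.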



Tempered by proper caution about the danger of basing extremely
important decisions on studies with small numbers, "formative planned
variations" allow you to rest program planning on more than hunches.

2. You might relax some of the more stringent requirements for imple-
menting a design. Since formative evaluation collects informa-
tion for the sole use of program staff, the formative evaluator can,
where necessary, relax some of the requirements for setting up a design.
This means that, when necessary, you can use assignment of students
that is slightly less than random, or choose a non-randomized control
group from students of a somewhat different socioeconomic group, as
long as interpretation of results is accompanied by appropriate caution.
The formative evaluator can at times relax design constraints because
the formative evaluator's constituency is the program staff. They will
use the data he gathers to make program change decisions. They will, in
addition, serve not only as judges of what constitutes credible informa-
tion but they will, through constant contact with the program, gather
much of their own data, at least that concerned with attitudes and
impressions. In situations when the formative evaluator has been able to
set up comparison trials of various program versions, staff members
inevitably gather first-hand experiences to use as a basis for making
program revisions.

Regarding design, the job of the formative evaluator seems to be to 
provide many opportunities for comparison, using as good a design as 
possible. The details of the implementation of any one design are not 
critical. 








Example. Jackson Elementary School, in the heart of a large urban
area, received Federal funds to design a compensatory education pro-
gram for the middle grades, with a particular focus on basic skills. The
school identified the students eligible for the program according to the
state's requirements for receiving the funds. By and large, these stu-
dents were chronically low achievers. A young and devoted school staff
had ideas about how best to use the money: they installed an Enrich-
ment Center based on open school guidelines, and used much of the
money to hire classroom aides. They were interested in keeping close
watch on the quality of achievement that their first year of program
operation produced, but they could not locate a control group. Some-
one suggested that the students in the school who traditionally per-
formed slightly below average but not as poorly as the target students
might form a rough control group for the study. Subsequently the
decision was made that these students would be tested for progress in
reading, math, and writing at the same times and using the same pre and
post measures as the program students. This is a modification of Design
3, the Non-Equivalent Control Group design (Chapter 4), with a special
awareness that the control group is indeed non-equivalent. The control
group, for one thing, did score just significantly higher than the pro-
gram students on a standardized pretest. A careful watch over the
course of the school year, however, showed that program students
received extensively more attention in basic skills areas and at the end
of the year were achieving about the same as the control group. Such a
design helped the staff conclude that the new program indeed did
benefit the target students: they were now achieving as well as students
who had scored better than them in the past.



An exception to this pronouncement about formative evaluation and
more relaxed designs occurs in the case of controversies within the staff
over different versions of program implementation. One of the jobs of
the formative evaluator is to collect information relevant to differences
of opinion about how the program should be designed or implemented.
In this case, as with summative evaluation, challenges to the conclusive-
ness of results can occur, and credibility will become again important.
Disagreements among planners can be translated into alternative treat-
ments to form bases for small experiments designed according to the
guidelines in this book.

3. You might want to perform short experiments or pilot tests. You will
find that program planners must constantly make decisions about how
a program will look. Most of these decisions must be made in the
absence of knowledge about what works best. Should all math instruc-
tion take place in one session, or should there be two during the day?
How much discussion in the vocational education course should pre-
cede field trips? How much should follow? Will reading practice on the
Readalot machine produce results as good as when children tutor one
another? How much worksheet work can be included in the French








course without damaging students' chances of attaining high conversa- 
tional fluency? You can settle these questions by believing whoever 
offers the most convincing opinion, or you can subject them to a test. 
Using one of the evaluation designs described in this book, particularly 
Designs 1, 2, or 3, you can conduct a short study to resolve the issue. 
Read Chapters 2 and 3. Then choose treatments to be given to students 
(or whomever) that represent the decision alternatives in question. The 
duration of the short study should last as long as you feel will realisti-
cally allow the alternatives to show effects. If you will be reading this
book for the purpose of designing short experiments, please substitute 
the word treatment for program as you read the text. The designs 
described in the book, and the procedures outlined for accomplishing 
them are, of course, equally appropriate. 



Example. A group of third grade teachers attending a convention heard
about a mathematics game which they thought would teach multiplica-
tion tables painlessly if played every Friday morning. Interested in
saving their students the agony of drill, the teachers urged their princi-
pal to purchase the game. The principal, a former math teacher, was
skeptical of the value of what she called "playing bingo." She refused.
The teachers, however, persuaded the principal to agree to a test: they
would randomly distribute students among their four classrooms every
Friday morning for four weeks, carefully controlling the number of
high and low math ability students distributed to each classroom. Two
of the teachers would play the math game; the other two would drill
their students in the same multiplication tables, and give prizes for
knowing tables exactly like those to be won playing the game. At the
end of the month, the data would be allowed to speak for themselves.
This highly credible Design 1 study would uncover differences between
drill and the program if any were to be attained.
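
The teachers' plan combines random distribution with control over the number of high and low ability students sent to each room. One way to do this, sketched below in Python, is to shuffle each ability level separately and deal its students out to the rooms in turn. The function name `distribute_by_ability`, the two labels "high" and "low", and the dictionary input are assumptions made for this illustration only.

```python
import random


def distribute_by_ability(students, n_rooms=4, seed=None):
    """Randomly distribute students among rooms, balancing ability levels.

    `students` maps each student's name to "high" or "low".
    Each ability level is shuffled on its own and dealt out in turn,
    so rooms differ by at most one student at each level.
    """
    rng = random.Random(seed)
    rooms = [[] for _ in range(n_rooms)]
    for level in ("high", "low"):
        stratum = [name for name, ability in students.items() if ability == level]
        rng.shuffle(stratum)
        for position, name in enumerate(stratum):
            rooms[position % n_rooms].append(name)
    return rooms
```

Calling the function again each Friday with a different seed, or with none, gives a fresh distribution while keeping the ability mix in each room the same.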



Use of design requires planning in advance, if only to locate a group
that is willing to serve as the comparison. Even if you have no intention of
collecting comparative data at the outset, it might be a good idea to locate
a handy group from whom you will be able to pull students in order to try
out new lessons or plans or to do short experiments. Often you will find a
teacher who is not taking part in the program who will be glad to provide
you with a little time to give supplementary instruction, a short quiz, or
a questionnaire to his class.

Evaluation Where Design Presents Problems:

PROGRAMS AIMED AT SPECIAL POPULATIONS

Many evaluators find themselves in the position of col-
lecting information about the quality of funded programs
aimed at helping students, clients or others who are ex-
tremely rich or poor in a certain disposition, ability or
attitude. In a school setting, for example, these special






categories of children might score, for instance, in the top 2% on an IQ
test and be labeled gifted, or below 75 IQ and be classified as retarded.
The students may be handicapped or emotionally disturbed. Programs
aimed at these students present unique design problems because laws
requiring that all such children be educated rule out evaluation designs
where the control group receives no special program. A comparison group
can therefore only be formed if the school has two programs available for
special students.



Example. A school tried two different programs for its gifted
students. Gifted students were randomly assigned to one or another
program for a 10-week trial period, at the end of which results from
both programs were examined by the principal. Reactions of students
and parents were positive for both programs, but one program involving
field trips raised considerable resentment from students not in the
program. Since they were unable to justify the field trips,
the principal and staff chose to continue with the other program.

The following paragraphs suggest other possible approaches to
evaluation of special programs. The reader is also referred to How To
Conduct Qualitative Studies for a discussion of alternative approaches.

1. Use a non-equivalent control group design (Design 3, Chapter 4).
Such a comparison could be made if another district or school with no
special programs, or programs appreciably different from yours, agreed
to give the same tests as yours and to share results.

Example. Teachers of educable mentally retarded students planned a
reading skills program which they hoped would significantly improve
the reading of their EMR students. They asked a nearby elementary
school to share with them results of a reading test given by the district
[illegible] the EMR students at the beginning and end of the school year.
Progress of the two groups in reading could be [illegible]



2. Adopt a formative approach and evaluate program components.
Comparative studies of the effects of whole programs are not always the
best service you can provide to the program staff or even the funding agency.
Rather, more useful information can be gained by evaluating com-
ponents of a special education program with a view to recommending
changes that might be needed in these. In some cases, for example,
alternative materials might be available for teaching the same objectives.
Small scale experiments could be set up in several schools, using a
pretest-posttest true control group design (Design 1, Chapter 4) in each
classroom to obtain objective data on the effectiveness of the various
alternatives.






3. Compare diverse programs in terms of some common indicator, e.g.,
satisfaction with program outcomes, using Design 3. Sometimes an
evaluator is asked to evaluate a number of special programs which
individual schools or projects have produced and which all have
different goals and objectives. For example, perhaps at one school, the
gifted program concentrates on acceleration in math, at another on
breadth of exposure in science, at another on creative writing, and at a
fourth on all these things at once. You could measure at all schools
student and parent satisfaction with the instruction provided in
individual subjects (math, science, writing skills). Perhaps you would
find results like this:

[Figure: Parent satisfaction (vertical axis, up to High) with math, creative
writing, and science instruction, plotted for each school by the reported
emphasis of its gifted program: math, science, creative writing, everything.]

In general, parents seem equally satisfied with both math and creative
writing, no matter what the emphasis reported by the school. Satisfac-
tion with science, however, seems very sensitive to whether or not
science is emphasized by the gifted program. When it is, there is high
satisfaction. The evaluator might note that in the absence of special
effort, science might not be well taught to gifted students, at least if
parent satisfaction is a valid indicator.

The point of this example is that diverse programs can be assessed if
you can find a single dimension on which to compare them. Opinions
and attitudes often provide this common ground. This kind of investiga-
tion at least tells you what kind of programs seem to make a difference
on the dimension you have chosen.
4. Compare program outcomes to pre-established criteria
and use Design 6 (Before-and-After Design, Chapter 6).
Frequently, special programs are required to state
measurable goals, and the evaluator's job is to mea-
sure goal achievement. This often turns into a
game of who can set goals which are lofty enough to
be acceptable but simple enough to be reached, es-
pecially when goals are set in terms of standardized
test gains. Sometimes, however, when the




goals are derived from criteria which have intrinsic, recognizable value,
reasonable goal setting is an excellent approach. For example, specifica-
tion of some basic survival skills, such as reading road signs correctly
and making change, for retarded students, could provide mastery goals
for an EMR program. A fairly good assessment of program effectiveness
can be made even in the absence of a good design, if program results 
can be compared to reasonable goals. 
5. Make the evaluation theory-based. A good approach to assessing the
results of special programs is to do a theory-based evaluation.
This is an evaluation that focuses on program implementation, holding
the staff accountable for operating the program they have promised.
The theory-based evaluation first asks: On what theory of instruction,
theory of learning, psychological theory, or philosophical point of view
is the program based? In other words, what activities does the staff view
as critical to obtaining good results toward which the program aims?
Detailed questioning of the staff makes explicit the model, theory, or
philosophy that the staff is trying to implement. Once you know the
staff's intention, your job will be to ascertain if activities that are
specified by the theory are being effectively operationalized and imple-
mented. Of course if you decide to do a theory-based evaluation, the
existence of planned activities must be documented through objectively
collected evidence, not just through testimonials. If you can show in
your evaluation that the elements which the theory specifies as neces-
sary for goal attainment are present, then you have shown that the
program has taken an effective step toward goal achievement. If the
theory is correct, goals should be reached eventually.



For Further Reading

Anderson, S. B. (Ed.). New directions in program evaluation. San Fran-
cisco: Jossey-Bass, Inc., 1978.

House, E. R. (Ed.). School evaluation: The politics and process. Berkeley:
McCutchan, 1973.

Morris, L. L., & Fitz-Gibbon, C. T. Evaluator's handbook. In L. L. Morris
(Ed.), Program evaluation kit. Beverly Hills: Sage Publications, 1978.

Popham, W. J. Educational evaluation. Englewood Cliffs, NJ: Prentice-
Hall, 1975.

Struening, E. L., & Guttentag, M. Handbook of evaluation research. Vol.
1. Beverly Hills: Sage Publications, 1975.

Worthen, B. R., & Sanders, J. R. Educational evaluation: Theory and
practice. Worthington, Ohio: Charles A. Jones Publishing, 1973.





HOW TO MEASURE PROGRAM IMPLEMENTATION 
(DRAFT) 

Revision Author: Jean King 

December 6, 1985 



Table of Contents 

Chapter 1. Measuring Program Implementation: An Overview 

Chapter 2. Questions to Consider in an Implementation Evaluation 

Chapter 3. How to Plan for Measuring Program Implementation 

Chapter 4. Methods for Measuring Program Implementation: Records

Chapter 5. Methods for Measuring Program Implementation: Self-Reports

Chapter 6. Methods for Measuring Program Implementation: Observations

Appendix A An Outline for an Implementation Report 






Chapter 1 

MEASURING PROGRAM IMPLEMENTATION: AN OVERVIEW 






How To Measure Program Implementation is one component of the
Program Evaluation Kit, a set of guidebooks written primarily for
people who have been assigned the role of program evaluator. The
evaluator has the often challenging job of scrutinizing and describing
programs so that people may judge the program's quality as it stands
or determine ways to make it better. Evaluation almost always demands
gathering and analyzing information about a program's status and
sharing it in one form or another with program planners, staff, or
funders.

This book deals with the task of describing a program's
implementation, i.e., how the program looks in operation. Keeping
track of what the program looks like in actual practice is one of the
program evaluator's major responsibilities because you cannot evaluate
something well without first describing what that something is. If
you have taken on an evaluation project, therefore, you will need to
produce a description of the program that is sufficiently detailed to
enable those who will use the evaluation results to act wisely. This
description may or may not be written. Even if delivered informally,
however, it should highlight the program's most important
characteristics, including a description of the context in which the
program exists (its setting and participants) as well as its
distinguishing activities and materials. The implementation report
may also include varying amounts of backup data to support the
accuracy of the description.



The overall objective of this book is to help you develop skills 
in describing program implementation and in designing and using 
appropriate instruments to generate data to support your description. 
The guidelines in the book derive from three sources: the experience 
of evaluators at the Center for the Study of Evaluation, University of 
California, Los Angeles; advice from experts in the fields of 
educational measurement and evaluation; and comments of people in 
school, system, and state settings who used a field test edition of 
the book. How To Measure Program Implementation has three specific 
purposes : 

1. To help you decide how much effort to spend on describing 
program implementation 

2. To list program features and activities you might describe in 
a program implementation report

3. To guide you in designing instruments to produce supporting
data so that you can assure yourself and your audience that
your description is accurate

The book has six chapters. Chapter 1 discusses the reasons for
examining a program's implementation. Chapter 2 provides a list of
questions that might be answered by an implementation evaluation.
Chapters 3 through 6 comprise the "How to Measure" section of the
book. Chapter 3 discusses how to plan an implementation evaluation,
followed by three methods chapters devoted to an examination of
existing records, self-report measures (questionnaires and
interviews), and observation techniques. 

Wherever possible, procedures in the "how to" sections are 
presented step-by-step to give you maximum practical advice with 
minimum theoretical interference. Many of the recommended procedures,
however, are methods for measuring program implementation under ideal
circumstances. It is no surprise that few evaluation situations in 
the real world match the ideal, and, because of this, the goal of the 
evaluator should be to provide the best information possible. You
should not expect, therefore, to duplicate step-by-step the
suggestions in this book. What you can do is to examine the
principles and examples provided and then adapt them to your 
situation, whatever the evaluation constraints, data requirements, and 
report needs. This means gathering the most credible information 
allowable in your circumstances and presenting the conclusions so as 
to make them most useful to each evaluation audience. 

Why Look at Program Implementation?

One essential function of every evaluation is answering the
question, "Does the combination of materials, activities, and 
administrative arrangements that comprise this program seem to lead to 
its achieving its objectives?" In the course of an evaluation, 
evaluators appropriately devote time and energy to measuring the 
attitudes and achievement of program participants. Such a focus
reflects a decision to judge program effectiveness by looking at
outcomes and asking such questions as the following: What results did 
that program produce? How well did the participants do? Was there 
community support for what went on in the program? Every evaluation 
should consider such questions. 

But to consider only questions of program outcomes may limit the 
usefulness of an evaluation. Suppose evaluation data suggest 
emphatically that the program was a success. "It worked!" you might
say. Unless you have taken care, however, to describe the details of
the program's operations, you may be unable to answer the question
that logically follows your judgment of program success, the question
that asks, "What worked?" If you cannot answer that question, you
will have wasted effort measuring the outcomes of events that cannot 
be described and must therefore remain a mystery. Unless the 
programmatic black box is opened and its activities made explicit, the 
evaluation may be unable to suggest appropriate changes. 

If this should happen to you, you will not be alone. As a matter 
of fact, you will be in good company. Few evaluation reports pay 
enough attention to describing the program processes that helped 
participants achieve measurable outcomes. Some reports assume, for 
example, that mentioning the title and the funding source of the 
project provides a sufficient description of program events. Other 
reports devote pages to tables of data (e.g., Types of Students 
Participating or Teachers Receiving In-service Training by Subject
Matter Area) on the assumption that these data will adequately
describe the program's processes for the reader. Some reports may
provide a short but inadequate description of the program's major
features (e.g., materials developed or purchased, teacher and student
in-class activities, employment of aides, administrative supports, or
provisions for special training). After reading the description the
reader may still be left with only a vague notion of how often or for 
what duration particular activities occurred or how program features 
combined to affect daily life at the program sites. 

To compound the problem of omitted or insufficient description,
evaluation reports seldom tell where and how information about program
implementation was obtained. If the information came from the most 
typical sources — the project proposal or conversations with project 
personnel — then the report should describe the efforts made to 
determine whether the program described in the proposal or during 
conversations matched the program that actually occurred. Few
evaluations give a clear picture of what the program that took place 
actually looked like and, among those few that do provide a picture of 
the program, most do not give enough attention to verifying that the 
picture is an accurate one.

It could be argued that this lack of attention to detail and 
accuracy is justifiable in situations where no one wants to know about
the exact features of the program. This, however, is a bogus argument
because you simply cannot interpret a program's results without 
knowing the details of its implementation. For one thing, an 
evaluation that ignores implementation will add together results from 
sites where the program was conscientiously installed with those from 
places that might have decided, "Let's not and say we did." If 
achievement or attitude results from the overall evaluation are 
discouraging, then what's to be done? This scenario typifies a poor 
evaluation study, but unfortunately, it describes many large-scale
program evaluations from the '70s, including a few of those most
notorious for showing "no effect" in expensive Federal programs (e.g., 
the 1970 evaluation of Project Follow-Through). 

What is more, ignoring implementation — even when a thorough
program description is not explicitly required — means that information
has been lost. This information, if properly collected, interpreted,
and presented, could provide audiences now and in the future a picture 
of what good or poor education looks like. One important function of 
evaluation reports is to serve as program records. Without such 
documentation educators may continue to repeat the mistakes of the 
past.

Why look at program implementation? Two things should be clear by
now:



- Description, in as much detail as possible, of the
materials, activities, and administrative arrangements that
characterize a particular program is an essential part of its 
evaluation; and 

- An adequate description of a program includes supporting 
data from different sources to insure thoroughness and
accuracy. 



How much attention you choose to give to implementation in your own
situation, then, will substantially affect the quality of your
evaluation. A detailed implementation report, intended for people
unfamiliar with the program, should include attention to program
characteristics and supporting data as described in Table 1.



[Insert Table 1]

What and How Much To Describe?

A quick look in Chapter 2 at the list of possible questions for an
implementation evaluation will show you that assembling information
and writing a detailed implementation report about even a small
program could be an impossible job for one person who must work within
the constraints of time and a budget. To help you in such a
situation, the remainder of the chapter poses some questions to focus
your thinking about what to look at, measure, and report. Considering
these questions before you make decisions about measuring
implementation should help insure that you spend the right amount of
time and effort describing the program and use the measures most
appropriate to your circumstances.

Before planning data collection about program implementation, you 
will need to make two decisions:

1. Which features of the program is it most critical or valuable for
me to describe? This may amount to deciding which questions in
Chapter 2 to use. Your answer will depend, in part, on how much
time and money you have. It will also be affected by your role 
vis-a-vis the staff and the funding agency, the announced major 
components of the program, and the amount of variation allowed by 
its planners. 

2. How much and what kind of data will be necessary to support the
accuracy of the description of each program characteristic?
Decisions about backup evidence will determine whether your report 
simply announces the existence of a program feature or offers 
evidence to support the description you have written. This 
decision will also be constrained by time and money, as well as by 
your own judgments about the need for corroboration and the amount
of variation you have found in the program. 

If you feel that your experience with evaluation or with the program,
the staff, or the funding agency is sufficient to allow you to make
these decisions right now, then proceed to Chapter 2 and begin
planning your data collection.

If you do not yet feel ready, the four questions that follow will
give you further guidance toward making decisions about what to look
at and how to back up your report. These questions relate to the
following issues: (1) deciding whether you need to document the
program or work for its improvement; (2) determining the most critical
features of the program you are evaluating; (3) finding out how much
variation there is in the program; and (4) deciding how much and what
type of supporting data is needed. 

Question 1. What Purposes Will Your Implementation Study Serve?

This question asks you to consider your role with regard to the 
program. Your role is primarily determined by the use to which the 
implementation information you supply will be put. The question of
use will override any other you might ask about program
implementation.

If you have responsibility for producing a summary statement about
the general effectiveness of the program, then you will probably
report to a funding agency, a government office, or some other
representative of the program's constituency. You may be expected to
describe the program, to produce a statement concerning the
achievement of its intended goals, to note unanticipated outcomes, and
possibly to make comparisons with an alternative program. If these
tasks resemble the features of your job, you have been asked to assume
the role of summative evaluator.

On the other hand, your evaluation task may characterize you as a
helper and advisor to the program planners and developers. During the
early stages of the program's operations, you may be called on to
describe and monitor program activities, to test periodically for
progress in achievement or attitude change, to look for potential
problems, and to identify areas where the program needs improvement.
You may or may not be required to produce a formal report at the end
of your activities. In this situation, you are a trouble-shooter and
a problem solver, a person whose overall task is not well-defined. If
these more loosely-defined tasks resemble the features of your job,
you are a formative evaluator. Sometimes an evaluator is asked to
assume both roles simultaneously — a difficult and hectic assignment,
but one that is usually doable.

While concerns of both the formative and summative evaluator focus
on collecting information and reporting to appropriate groups, the
measurement and description of program implementation within each
evaluation role varies greatly, so greatly that different names are
used to characterize the two kinds of implementation focus.
Description of program implementation for summative evaluation is
often called program documentation. A documentation of a program is 
its official description outlining the fixed critical features of the 
program as well as diverse variations that might have been allowed. 
Documentation connotes something well-defined and solid. 
Documentation of a program, its summative evaluation, should occur 
only after the program has had sufficient time to correct problems and 
function smoothly.

On the other hand, description of program implementation for 
formative evaluation can be called program monitoring or evaluation
for program improvement. Monitoring connotes something more active
and less fixed than documentation. The more fluid connotation of
monitoring reflects the evolving nature of the program and its
formative evaluation requirements. The formative evaluator's job is
not only to describe the program, but also to keep vigilant watch over
its development and to call the attention of the program staff to what
is happening. Program monitoring in formative evaluation should
reveal to what extent the program as implemented matches what its
planners intended and should provide a basis for deciding whether
parts of the program ought to be improved, replaced, or augmented. 
Formative evaluation occurs while the program is still developing and 
can be modified on the basis of evaluation findings. 

Measuring implementation for program documentation

Part of the task of the summative evaluator is to record, for 
external distribution, an official description of what the program 
looked like in operation. This program documentation may be used for 
the following purposes:

1. Accountability. Sometimes the expected outcomes of a program,
such as heightened independence or creativity among learners, are
intangible and difficult to measure. At other times program
outcomes may be remote and occur at some time in the future, after
the program has concluded and its participants have moved on.
This kind of outcome, concerned, for instance, with such matters
as responsible citizenship, success on the job, or reduced
recidivism, cannot be achieved by the participants during the
program. Rather, the program is intended to move its participants
toward achievement of the objective. In such instances, where
judging the program completely on the basis of outcomes might be
impractical or even unfair, program evaluation can focus primarily
on implementation. Program staff can be held accountable for at
least providing materials and producing activities that should
help people progress toward future goals. Alternative school
programs, retraining programs within a company, programs
responding to desegregation mandates, and other programs involving
shifts of personnel or students are examples of cases where
evaluation might well focus principally on implementation. Though
these programs might result in remote or fuzzy learning outcomes,
the nature of their proper implementation can often be precisely
specified.

Of course, you might need to measure implementation for
accountability purposes in any case. Even when a program's
objectives are immediate and can be readily measured, it is likely
that the staff will be accountable for some amount of 
implementation of intended program features. They will need to 
show, in other words, where the money has gone. This role of 
program documentation has been called the signal function, a sign 
of compliance to an external agency, a report that says "We did 
everything we said we were going to." While some may belittle 
this type of evaluation, its successful and timely completion is 
often critical to continued funding, and its importance should not 
be underestimated.

2. Providing a lasting description of the program. The summative
evaluator's written report may be the only description of the
program remaining after it ends. This report should therefore
provide an accurate account of the program and include sufficient
detail so that it can serve as a basis for planning by those who
may want to reinstate the program in some revised form or at
another site. Such future audiences of your report need to know
the characteristics of the site and the sorts of activities and
materials that probably brought about the program's outcomes.

3. Providing a list of the possible causes of the program's effects.
While such cases are unusual, a summative evaluation that uses a 
highly credible design and valid outcome measures constitutes a
research study. It can serve as a test of the hypothesis that the 
particular set of activities and materials incorporated in the 
program produces good achievement and attitudes. Here the 
summative report about a particular program has something to say 
to policy makers about programs using similar processes or aiming 
toward the same goals. The activities and materials described in 
the evaluator's documentation, in this case, are the independent 
or manipulated variables in an educational experiment.

The development of evaluation thinking over the past twenty 
years has led away from the notion that the quantitative research 
study is the only and ideal form for an evaluation to take. In 
cases where variables cannot be easily controlled or where 
creating a control group will deprive individuals of needed 
services or training, evaluators should neither lament their fate
nor demean the project. But in those few cases where an evaluator
has the opportunity to design and conduct a research study in the
traditional sense, the opportunity should not be wasted.

Knowing the uses to which your documentation will be put helps you
to determine how much effort to invest in it. Implementation
information collected for the purpose of accountability should focus
on producing the required "signals" by examining those activities,
administrative changes, or materials that are either specifically
required by the program funders or have been put forward by the
program's planners as major means for producing its beneficial
effects.

The amount of detail with which you describe these characteristics 
will depend, in turn, on how precisely planners or funders have
specified what should take place. If planners, for example, have 
prescribed only that a program should use the XYZ Reading Series, 
measuring implementation will require examining the extent of use of 
this series. If, on the other hand, it is planned that certain 
portions of the series be used with children having, say, problems
with reading comprehension, then describing implementation will 
require that you look at which portions are being used, and with whom. 
You will probably need to look at test scores to insure that the 
proper students are using XYZ. The program might further specify that 
teachers working in XYZ with problem readers carry out a daily
10-minute drill, rhythmically reading aloud, in a group, a paragraph
from the XYZ story for the week. If the program has been planned this
specifically, then your program description will probably need to
attend to these details as well. As a matter of fact, attention to
specific behaviors is a good idea when describing any program where
you see certain behavior occurring routinely. Program descriptions at
the level of teacher and student behavior help readers to visualize
what students have experienced, giving them a good chance to think
about what it is that has helped the students to learn. 

If accountability is the major reason for your summative 
evaluation, then you must provide data to show whether — and to what
extent — the program's most important events actually did occur. The
more skeptical your audience, the greater the necessity for providing
formal backup data. Concerns about the skeptical audience are
elaborated in later questions in this chapter. [MAKE SURE THEY ARE.] 

If you need to provide a permanent record of program 
implementation for the purpose of its eventual replication or
expansion, try to cover as many as possible of the program 
characteristics listed in Chapter 2. The level of detail with which 
you describe each program feature should equal or exceed the 
specificity of the program plan, at least when describing the features 
that the staff considers most crucial to producing program effects. 
If additional practices typical of the program should come to your 
attention while conducting your evaluation, you should include these. 
You will need to use sufficient backup data so that neither you nor 
your audience doubt the accuracy or generality of your description. 

When describing implementation for the purposes of accountability 
and leaving a lasting record of a program, the data you collect can be 
fairly informal, depending on your audience's willingness to believe
you. You might talk with staff members, peruse school records, drop
in on class sessions, or quote from the program proposal.

In cases where the reason for measuring implementation involves
research or where there is potential for controversy about your data
and conclusions, you will need to back up your description of the
program through systematic measurement, such as coded observations by
trained raters, examination of program records, structured interviews,
or questionnaires. Carefully planned and executed measurement will
allow you to be reasonably certain that the information you report
truly describes the situation at hand. It is important that the
evaluator produce formal measures in cases where he himself wants to
verify the accuracy of his program description. It is essential that
he measure if he thinks he will need to defend his description of the
program, that is, if he might confront a skeptic. An example from a
common situation should illustrate this.

[INSERT EXAMPLE HERE]

Measuring implementation for program improvement

As has been mentioned, the task of the formative evaluator is
typically more varied than that of the summative evaluator. Formative
evaluation involves not only the critical activities of examining and
reporting student progress and monitoring implementation; it also
often means assuming a role in the program's planning, development,
and refinement. The formative evaluator's responsibilities
specifically related to program implementation usually include the
following:

1. Insuring, throughout program development, that the program's
official description is kept up-to-date, reflecting how the
program is actually being conducted. While for small-scale
programs, this description could be unwritten and agreed upon by
the few active staff members, most programs should be described in
a written outline that is periodically updated. An outline of
program processes written before implementation is usually called a
program plan. Recording what has taken place during the program's
implementation produces one or more formative implementation
reports. The task of providing formative implementation
reports — and often insuring the existence of a coherent program
plan as well — falls to the formative evaluator.

The topics discussed in the formative report could coincide with
the headings in the implementation report outline in the Appendix.
The amount of detail in which each aspect of the program is
described should match the level of detail of the program plan.
In many situations, the formative evaluator finds his first task
to be clarification of the program plan. After all, if he is to
help the staff improve the program as it develops, he and they
need to have a clear idea at the outset of how it is supposed to
look. If you plan to work as a formative evaluator, do not be
surprised to find that the staff has only a vague planning
document. Unless the program relies heavily on commercially
published materials with accompanying procedural guides, or the
program planners are experienced curriculum developers, planners
have probably taken a wait-and-see attitude about many of the
program's critical features. This attitude need not be
bothersome; as long as it does not mask hidden disagreements among
staff members about how to proceed, or cover up uncertainty about
the program's objectives, a tentative attitude toward the program
can be healthy. It allows the program to take on the form that
will work best. 

It gives you, however, the job of recording what does happen so 
that when and if summative evaluation takes place, it will focus 
on a realistic depiction of the program. An accurate portrayal of 
the program will also be useful to those who plan to adopt, adapt,
or expand the program in the future. The role of the evaluator as
program historian or recorder is an essential one, as it is often
the case that staff people simply have no time for such luxuries.
Even as simple a record as notes from meetings, arranged
chronologically, can provide helpful information at a later date.

2. Helping the staff and planners to change and add to the program as
it develops. In many instances the formative evaluator will
become involved in program planning — or at least in designing
changes in the program as it assumes clearer form. How involved
she becomes will depend on the situation. If a program has been
planned in considerable detail, and if planners are experienced
and well versed in the program's subject matter, then they may
want the formative evaluator only to provide information about
whether the program is deviating from the program plan.
On the other hand, if planners are inexperienced or if the program
was not planned in great detail in the first place, then the
evaluator becomes an investigative reporter. Her first job might
be to find out what is happening — to see what is going well and
badly in the program. She will need to examine the program's
activities independent of guidance from the plan, and then help
eliminate weaknesses and expand on the program's good points. If
this case fits your situation, use the list of implementation
characteristics in Chapter 2 as a set of suggestions about what to
look for or adopt the naturalistic approach described later.
The formative evaluator's service to a staff that wants to change
and improve its program could result in diverse activities. Two
of them are particularly important:

a. The formative evaluator could provide information that prompts
the staff and planners to reflect periodically on whether the 
program that is evolving is the one they want to have. This 
is necessary because programs installed at a particular site 
practically never look as they did on paper — or as they did 
when in operation elsewhere. At the same time, staff and 
planners will be persuaded to reexamine their initial thinking 
about why the processes they have chosen to implement will 
lead to attaining their objectives. Careful examination of a 
program's rationale, handled with sensitivity to the program's 
setting, could turn out to be the greatest service of a 
formative evaluator. The planners should have in mind a 
sensible notion of cause and effect relating the desired 
outcomes to the program-as-envisioned. Insofar as the
program-as-implemented and the outcomes observed fail to match
expectations, the program's rationale may have to be revised. 

b. Controversies over alternative ways to implement the program 
might lead the formative evaluator to conduct small-scale
pilot studies, attitude surveys, or experiments with
newly-developed program materials and activities. Program
planners, after all, must constantly make decisions about how
the program will look. These decisions are usually based only
on hunches about what will work best or will be accepted most
readily. For instance: Should all math instruction take
place in one session or should there be two sessions during
the day? How much discussion in the vocational education
course should precede field trips? How much should follow?
Will practice on the Controlled Reading Machine produce
results that are as good as those obtained when children tutor
one another? How much additional paperwork will busy
instructors tolerate? How much worksheet activity can be
included in the French course without detracting from
students' chances of attaining high conversational fluency?
These are good and reasonable questions that can be answered by
means of quick opinion surveys or short experiments, using the
methods described in most texts on the topic of research
design.³ A short experiment will require that you select
experimental and control groups, and then choose treatments to
be given to these groups that represent the decision
alternatives in question. These short studies should last
long enough to allow the alternatives to show effects. The
advantage of performing short experiments will quickly become
apparent to you; they provide credible evidence about the
effectiveness of alternative program components or practices.
At the same time, it must be remembered that the real world
environment surrounding most evaluations makes even simple
experiments difficult to conduct.
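
The short experiment described in item b can be sketched in code. What
follows is only an illustration under stated assumptions: the student
roster, the two alternatives compared (one math session versus two),
and the post-test scores are all hypothetical, and a plain difference
in group means stands in for whatever analysis a text on research
design would recommend.

```python
import random
from statistics import mean


def assign_groups(participants, seed=0):
    """Randomly split participants into two groups of (nearly) equal size."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_difference(scores, group_a, group_b):
    """Return the mean score of group_a minus the mean score of group_b."""
    return mean(scores[p] for p in group_a) - mean(scores[p] for p in group_b)


# Hypothetical roster and post-test scores, for illustration only.
students = [f"student_{i}" for i in range(1, 9)]
one_session, two_sessions = assign_groups(students, seed=42)
post_test = {s: 70 + i for i, s in enumerate(students)}
difference = mean_difference(post_test, one_session, two_sessions)
```

Random assignment is used here because it keeps the two groups
comparable on average. A difference observed in so small a trial would
be a prompt for staff discussion, not a firm conclusion.
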
When measuring implementation for program improvement, the form of
evaluation reports can and should vary greatly. Informal
conversations with an influential staff member may have more effect
than a typewritten report, and particularly a report loaded with
statistical tables. Periodic meetings to discuss program problems and
issues may update administrators and teachers, forcing them to think
about the activities in which they are engaged far better than even a
short written document could. One well-known evaluator has gone so far
as to have program personnel place bets on the likely outcomes of data
analysis so they will have a vested interest in the results.

Whether you work as a summative or a formative evaluator, you will 
need to decide how much of your implementation report can rely on 
anecdotal or conversational information and still be credible, and how
much your report needs to be backed up by data produced by formal or
systematic measurement of program implementation. If what you
describe can make a difference to those who might use it for any of
the purposes mentioned, then your implementation report deserves all
the time and effort you can afford.

Question 2. What Are the Program's Most Critical Characteristics?

Having determined the purposes — formative or summative (or both) — that
your implementation study will serve, your identification of the
program's critical features will help you further to determine two
things:

- The specific questions your evaluation will address

- The level of detail you should use in describing the program

Three features common to all programs can form an initial outline of a
program's critical characteristics: context; activities; and
"theory." You can begin to describe the program by outlining the
elements of the program's context — the tangible features of the
program and its setting:

- The classrooms, schools, districts, or sites where the program
has been installed

- The program staff — including administrators, teachers, aides,
parent volunteers, secretaries, and other staff

- The resources used — including materials constructed or
purchased, and equipment, particularly that purchased
especially for the program

- The students or participants — including the particular
characteristics that made them eligible for the program, their
number, and their level of competence at the beginning of the
program

These context features constitute the bare bones of the program
and must be included in any summary report. Listing them does not
require much data gathering on your part, since they are not the sort
of data that you expect anyone to challenge or view with skepticism.
Unless you have doubts about the delivery of materials, or you think
that the wrong staff members or students may be participating, there 
is little need for backup data to support your description. 

Another part of the context you would do well to consider is not
tangible, but may be essential to understanding program functioning.
This is the political context into which the program is set. It
includes, for example, understanding what interest groups or powerful
individuals are involved in the program, how funding was initially
secured, the role of top managers, problems encountered in the
program, and so forth. In some settings, none of this will matter; in
others, such information will allow you to target your evaluation on
what can be usefully addressed. While such information is unlikely to
appear in formal evaluation documents, only a naive evaluator operates
without an awareness of the political context, and he does so at his
own and his evaluation's risk.

In addition to context features, the second area to describe in
looking for critical characteristics is that of program activities.

Describing important activities demands formulating and answering
questions about how the program was implemented, for example:

- What were the materials used? Were they used as intended?

- What procedures were prescribed for the instructors to follow
  in their teaching and other interactions with students? Were
  these procedures followed?

- In what activities were the participants in the program
  supposed to participate? Did they?

- What activities were prescribed for other participants (aides,
  parents, tutors)? Did they engage in them?

- What administrative arrangements did the program include? What
  lines of authority were to be used for making important
  decisions? What changes occurred in these arrangements or
  lines of authority?

Listing the salient activities intended to occur in the program
will, of course, take you much less time than verifying that they have
occurred, and in the form intended. Unlike materials, which usually
stay put and whose presence can be checked at practically any time,
program activities may be inaccessible once they have occurred if they
were not consciously observed or recorded. Counting them or merely
noting their presence is therefore no small task. In addition,
activities are more difficult to recognize than context features.
Math games, microcomputers, aides, and science materials from Company
X are easily identified; but what exactly does the act of
reinforcement or acceptance of a student's cultural background look
like when it is taking place?

Occurrence of intangible activities such as reinforcement or
cultural acceptance cannot be simply observed and reported like an
inventory of materials or a headcount of students. Even if they could
be directly observed, you could not possibly describe all of them.
You will have to choose which activities to attend to. Your choice of
these activities will in large measure depend upon what your audience
has said it needs to know in order to make informed decisions.

Once context and activities are delineated, the third and often
the most difficult program feature to determine is what can be called,
for want of a better term, the program's "theory." Every program, no
matter how small, operates with some notion of cause and effect, that
is, with a theory. Examples are numerous: if teenage parents learn
parenting skills, their children will eat more nutritiously; if
bilingual students receive reinforcement in their native language,
their cognitive skills and self-concept will develop normally; if
long-term employees undergo technical education, their job productivity
will increase. Some programs (e.g., Montessori schools, E.S.T., or
Camp Hill Villages) are systematically designed to implement the
tenets of an explicitly stated model, theory, or philosophy. Others
evolve their own theories, combining common sense, practice, and
theoretical tenets from a variety of sources. The job for the
evaluator is to discover this theory in order to better understand how
the program is supposed to work and its critical characteristics in
the eyes of program planners and staff.

On paper it sounds easy to describe a program's context, key
activities, and "theory," but when you try to do it, critical details
may prove elusive. Three sources of information should help you
decide what your evaluation should examine:

1. The program proposal or plan

2. Opinions of program personnel, experts, and yourself, based on
assumptions about what makes an educational program work

3. Your own observations

Picking out critical program features from the plan or proposal

Some program proposals will come right out and list the program's
most important features, perhaps even explaining why planners think
these materials and activities will bring about the desired outcomes.
But many will not, although if you look carefully, you may find clues
about what is considered important. For instance, most proposals or
documents describing a program will refer over and over to certain key
activities that should occur. As a rule of thumb, the more frequently
an activity is cited, the more critical someone considers it to be for
program success. You may therefore decide that activities repeatedly
mentioned are critical program components to which the evaluation must
attend.

The program's budget is another index to its crucial features. As
another rule of thumb, you may assume that the larger the budgeted
dollar or other resource expenditure, such as staffing level, for a
particular program feature (activity, event, material, or
configuration of program elements), the greater its presumed
contribution to program success. Taken together, these two planning
elements, frequency of citation and level of expenditure or
effort, can provide some indication of the program's most critical
components.
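The two rules of thumb above can be turned into a rough ranking. The following is a minimal sketch, not a procedure taken from the kit: the feature names, citation counts, dollar amounts, and the equal weighting of the two indicators are all hypothetical assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str        # hypothetical program feature
    citations: int   # times the feature is mentioned in the plan or proposal
    budget: float    # dollars (or other resource units) budgeted for it


def rank_features(features):
    """Rank features by the average of citation share and budget share.

    Weighting the two indicators equally is an assumption of this sketch.
    """
    total_citations = sum(f.citations for f in features) or 1
    total_budget = sum(f.budget for f in features) or 1.0
    scored = []
    for f in features:
        citation_share = f.citations / total_citations
        budget_share = f.budget / total_budget
        scored.append((f.name, round((citation_share + budget_share) / 2, 3)))
    # Highest combined share first
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical tallies made while reading a proposal and its budget
features = [
    Feature("peer tutoring", citations=12, budget=4000.0),
    Feature("parent workshops", citations=3, budget=1000.0),
    Feature("audio-visual materials", citations=5, budget=5000.0),
]

ranking = rank_features(features)
for name, score in ranking:
    print(f"{name}: {score}")
```

A ranking like this is only a starting point for conversation with planners and staff; a feature that is cheap and seldom mentioned may still turn out to be critical.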






Relying on the program plan's suggestions about what to look at is one
point of view from which to approach your implementation evaluation. An
implementation evaluation based heavily on the program plan will
involve collecting data to determine the extent to which the crucial
activities named in the plan occurred as intended and, if they did not
occur as planned, what happened instead. Description of a program from
this point of view is the kind most often done, and for an
understandable reason: it provides the simplest means by which the
evaluator can decide which activities to look at.



Example. A group of [...] and science teachers wrote a proposal to the
state for a few thousand dollars to assemble a [...] course for [...]
High School. The program was to be based largely on purchased
audio-visual material. The state evaluator who examined the program
relied heavily on the original proposal as a program blueprint. To
complete the documentation section of his summative report, he simply
audited the program's official description, observing the program to
locate inconsistencies and discrepancies between the planned program
and the one that actually occurred.






Even in the absence of a formal written program plan, documentation
from the perspective of implicit planning can be done by interviewing
program planners and asking them to describe activities they feel are
crucial to the program. You can then proceed with the documentation of
the extent of occurrence of these activities.

You might find the program plan, and even the planners themselves, in
some instances, to be disappointing sources of ideas about what to look
for. They might not describe proposed activities to the degree of
specificity you feel you need, or they might express grandiose plans
engendered by initial enthusiasm, or in response to proposal guidelines
from the funding source which were themselves overly ambitious. It is
possible, as well, that the program has not been planned in any
specific way.

How, then, will you document the program if, for whatever reason, there
is no plan which details activities that are specific, feasible, and
consistent throughout? In this case, you have two options: you can rely
on what theory and experienced people say should be in the program, or
you can take the point of view of a responsive/naturalistic observer
and simply watch the program operating to discover what seem to be the
program's critical features.

Relying on opinions of program personnel, experts, and yourself
to select critical program features

If you have reason to believe that some feature not mentioned in the
program's planning documents might be necessary for program success,
then look for it. Commonly unmentioned but critical program
characteristics, for instance, are [...] and adequate time [...].
Planners of instructional programs, it seems, spend a lot of time
deciding what to teach and in what sequence, but often overlook
students' need to repeat and [...] the information they have received.

Whether or not the kinds of characteristics or program activities
mentioned above are critical in your situation, you should give
consideration to features not specifically cited in the written plan
whose presence or absence might be related to program success or
failure. If you are a formative evaluator, it is in fact your
responsibility to bring these matters to the attention of the staff.
You might, incidentally, discover a feature of the program that someone
thinks could actually make it fail. By all means, pay attention to this
kind of information, backing up your description with data.




Example. Mr. Walker, the director and de facto formative evaluator of
in-service training programs at a university-based Teacher Center,
noticed that some districts sending teachers allowed them free choice
of courses. Others, believing that in-service training should follow a
theme, encouraged teachers to take courses within a single area, say,
elementary math or [...] education.

Though the Teacher Center itself made no recommendations about what
courses should be pursued, Mr. Walker decided that the "theme versus
no-theme" factor of teacher training might have an effect on teachers'
overall assessment of the value of their in-service experiences. He
decided to describe the course of study of the two groups of teachers
at the Center and separately analyze the groups' responses to an
attitude questionnaire.

As Mr. Walker expected, teachers whose training followed a theme
expressed greater enthusiasm about the Teacher Center. Since he could
find no explanation for the difference in enthusiasm between the two
groups other than the thematic character of one group's program, Mr.
Walker recommended that the Center itself encourage thematic in-service
study. He used his descriptions of the courses of study of the teachers
in the theme group as a set of models the Center might follow.



To the extent that you base your choice of what to look for on a
set of assumptions about what works in education, you are conducting
what could be called a "theory-based" evaluation. Mr. Walker, in the
example above, worked from the rather rudimentary but reliable theory
that education that follows a program of study is more likely to be
perceived by the student as valuable. His evaluation was at least
partly theory-based because he used a theory to tell him what to look
at.

Examining program implementation in theory-based evaluation gives
your study a point of view toward the program similar to the one you
assume when basing implementation measurement on the proposal: you
begin with a prescription of what effective program activities might
look like. The prescription from the theory-based perspective, however,
comes not from a written plan, but from a theory.

A theory-based implementation evaluation is especially appropriate for
looking at a school program that is built on a model of teaching
behavior; a theory of learning, development, or human behavior; or a
philosophy concerning children, schools, or organizations. The specific
prescriptions of many such models and theories are familiar to most
people working in education.



Examples of some of these models are:

• Behavior modification and various applications of reinforcement
  theory to instruction and classroom discipline

• Piaget's theory of cognitive development and other models of how
  children learn concepts

• Open-classroom and free-school models such as those put forth by
  writers in education in the 1960's

• Fundamental-school and basic skills models which seek to reinstall
  traditional American classroom practices

• Models of organizations that prescribe arrangements and procedures
  for effective management

• Approaches to teaching critical thinking skills that encourage the
  use of higher level questioning techniques







A program identified with any of these points of view must set up roles
and procedures consistent with the particular theory or value system.
Proponents of open schools, for instance, would agree that a classroom
reflecting their point of view should display freedom of movement,
individualization of instruction, and curricular choices made by
students. Each theory, philosophy, or teaching model contends that
particular activities are either worthwhile in and of themselves or are
the best way to promote certain desirable outcomes. Measuring
implementation of a theory-based program, then, becomes a matter of
checking the extent to which activities or organizational arrangements
at the program sites reflect the theory.



Theories underlying programs may be intuitive and specific, as in
Mr. Walker's example, or explicit and general.



Cooley and Lohnes have proposed a general model of school learning
that seems particularly useful as a source of ideas for what to look at
when describing a program intended to teach people something. The
effectiveness of school programs in bringing about desired learning,
according to this model, depends on four factors:

1. Learning opportunities. Schools provide the time and place in which
students may practice new skills, attend to sources of new information,
or come in contact with models of how to act.

2. Motivation. Schools intentionally manipulate rewards and punishments
that persuade students to pursue prescribed activities and attend to
particular information.

3. Structured presentation of activities, ideas, and information.
Schools attempt to organize and sequence what is presented, tailoring
it to students' abilities so as to make learning as painless and
efficient as possible.

4. Instructional events. The school day is filled with social and
interpersonal contacts that promote learning. The elimination of
misunderstandings through dialogue, a teacher's effective use of
student contributions in a class discussion, and the personal attention
and reassurance that prevent student discouragement are some examples
of instructional events in this sense.

Figure 1 shows a simple diagram of the Cooley/Lohnes model.
Opportunity, motivators, structure, and instructional events change the
student from his level of initial performance to the criterion
performance desired for the program. The important thing about the
Cooley/Lohnes model for describing program implementation is that all
four of these aspects of schooling are critical features that an
evaluator might want to mention when describing a program.

Different philosophies about schooling can be described and compared
using the dimensions of this model. In looking at a program from this
point of view, you might ask questions such as the following:

• Opportunities. [...] were program activities available to [...] to
  learn and practice the target skill [...]? What conditions affected
  opportunity, such as attendance [...]? [...] across sites or among
  students? Did learning opportunities vary substantially in duration
  and nature? If so, were differences specified by the program, or left
  to teachers or students to determine?

• Motivators. Were the materials capable of maintaining student
  attention or interest? Was a reinforcement system used? What systems
  of reward or punishment were used, and were parents included? Did
  success with motivation techniques vary across sites? What conditions
  might have had a bearing on the different levels of motivation among
  students?

• Structure. To what extent were program objectives specified? Were
  learning hierarchies used to underlie the curriculum? Was a coherent
  outline used? How much attention was paid to sequencing of lessons?
  Was there an in-service program to teach novel subject matter, and
  were teachers aware of the rationale behind program sequencing? What
  efforts were made to ensure that program objectives and instruction
  would be suitable in terms of students' backgrounds and abilities?

• Instructional events. What interpersonal contacts were there that
  tended to support student involvement in program activities? How much
  personal attention did individual students receive? Was instruction
  one-to-one or one-to-many? Businesslike or friendly? Frequent or
  infrequent? Primarily between students and teachers, students and
  students, or students and aides?

Theory-based evaluation might also involve an assessment of the
consistency of the program plan with the underlying theory.

In summative evaluations based on a credible research design, you
should note, a theory-based evaluation can provide an actual test of
the theory's validity. Given the potential importance, and rarity, of
empirical validation of a theory, results of an evaluation which has
provided such validation should be reported and disseminated as widely
as possible.

Using observations and case studies to determine critical 
program features 

The evaluation literature in recent years has been charged by a
debate over the value of methods that have been variously called
qualitative, naturalistic, ethnographic, responsive, or even "new
paradigm." While each of these terms has its own proper definition,
in common usage they together describe evaluation techniques borrowed
largely from anthropology and sociology that generate words as
products, rather than numbers. The skills they demand of an evaluator
differ greatly from those required in the more traditional
quantitative approach, and, while it is beyond the scope of this book
to provide an in-depth description of qualitative evaluation methods,
references and how-to's will be added where appropriate throughout the
text. The use of observations and more detailed case studies to
determine critical features of a program being evaluated is one such
place and will necessarily involve qualitative methods.



It is possible for an evaluator with a qualitative mindset to
observe a program in operation with relatively few preconceptions or
decisions about what to look for. This strategy might be chosen for a
number of reasons. For one thing, some evaluators consider it the best
way of describing a program. Unhampered by preconceptions and
prescriptions, the responsive/naturalistic inquirer might set his
sights on catching the true flavor of a program, discovering the unique
set of elements that make it work and conveying them to the
evaluation's audience. Further, a naturalistic approach might be
necessary if there is no written plan for the program you are
evaluating and you find that one cannot be retrospectively constructed
with a reasonable degree of consistency by the planners. Even if there
is a plan, it might be vague or, from your perspective, unrealistic to
implement. Then again, you might discover that the program has been
allowed so much variation from site to site that common features are
not apparent at first. In any of these cases you have the option of
just observing.

Implicit in your decision to use responsive methods are two other
decisions:

1. To rely heavily on data collection methods that "get close to the
data," usually observations

2. To concentrate on relating what you found, rather than comparing
what was to what should have been

This may leave you up in the air at first about what to look for.



Example. The School Board of a small city decided that its schools
should spend one year emphasizing language arts, with particular
focus on improving students' writing skills. The district's Assistant
Superintendent for Curriculum resisted the initial impulse to design
and implement a common, districtwide program. Instead, she decided that
each teacher should be directed to respond, in his or her own way, to
the basic decision to emphasize writing. Her reasoning was that some
teachers would arrive at good methods that the other teachers could use
to everyone's advantage, students and teachers alike. To keep track of
what teachers were doing, however, she scheduled periodic teacher and
student interviews and dropped in on class sessions frequently. She
wrote vignettes describing various classroom practices she had seen and
which reflected the aspirations and reports of teachers and students.
Her report demonstrated to the Board the effects of its curriculum
decision and, circulated among teachers in abbreviated form, it served
as a source of new teaching ideas.






This might seem familiar to you. The evaluator's vignettes
correspond to how most people share information, and evaluation in a
context that is free from controversy or skepticism looks very much
like what people usually do. The difference between this kind of
evaluation and everyday informal reporting, however, is in the quality
of the observations made. Naturalistic evaluators use methods from the
social sciences, notably anthropology, to obtain corroboration for
their observations and conclusions. They have, in fact, developed a
method for conducting evaluations that follows that of naturalistic
field studies.

An evaluation using naturalistic methods would follow a scenario
something like this:

1. A particular program is to be evaluated. If there are numerous
sites, one or more sites are chosen for study.

2. The evaluator observes activities at the site or sites chosen,
perhaps even taking part in the activities, but trying to influence the
program routine as little as possible. Often, time constraints require
the use of "informants," people who have already been observing things
and who can be interviewed.

3. Though data collection could take the form of coded records like
those produced through the standard observation methods described in
Chapter 5, the responsive/naturalistic observer more often records what
he sees in the form of field notes. This choice of recording method is
motivated mainly by a desire to avoid deciding too soon which aspects
of the situation observed will be considered most important.

4. The responsive/naturalistic observer shifts back and forth between
formal data collection, study of recorded notes, and informal
conversation with the subjects. Gradually she produces a description of
the events and direct or indirect interpretation of them. The report is
usually an oral or written narrative, though naturalistic studies yield
tables, sociograms, and other numerical and graphic summaries as well.

Case studies (considered technically) represent not so much a method
as a choice of what to study. Case study researchers quite often follow
naturalistic methodology. The case study worker in evaluation chooses
to examine closely a particular case, that is, a school, a classroom, a
particular group, or an individual experiencing the program. Sometimes
the program itself is "the case." Whereas the naturalistic observer or
the more traditional evaluator might concentrate only on those
experiences of, say, a school, which are related to the program, the
case study evaluator will usually be interested in a broader range of
events and relationships. If the school is the subject of study, then
the job is to describe the school. The case study method places the
program within the context of the many things which happen to the
school, its staff, and its students over the course of the evaluation.
One result of this method, you can see, is to display the proportional
influence of the program among the myriad other factors influencing the
actions and feelings of the people under study.
While case studies often use naturalistic methods, presumably because
of the complexity of the experiences and encounters which need to be
described, it is possible for a case study to use more conventional
methods of data collection as well, including [...] reminiscent of
traditional experiments.

Regardless of how you determine them, a listing of the critical
features of the program will give you some idea of which questions to
answer in your implementation study. If you are a summative evaluator,
then your task will be to convey to your audience as complete a
depiction of the program's crucial characteristics as possible.






If you are a formative evaluator, then your decision about what to look
at might have to go a step beyond listing the program's critical
features. Since your job is to help with program improvement and not
merely to describe the program, your task is to collect information
that will be maximally useful for helping the program's staff improve
the program. In most cases, this will certainly mean monitoring the
implementation of the program's most critical features. But you will
need to consult with the program staff to find which among all the
program's critical features seem most troublesome to them, most in need
of vigilant attention, or most amenable to change. It could be the case
that a program's most critical feature is employment of aides. But once
the aides have arrived and it has been established that they come to
work regularly, attention to this detail may not be necessary. Your
formative service to the program will be more usefully employed in
monitoring the implementation of program aspects about which the staff
has genuine problems to solve.

Question 3. How Much Variation Is There in the Program?

Your choice of which program characteristics to describe will be
influenced by the amount of variation that occurs across sites where
the program is being used and variation that happens at different
points in time. For one thing, depending on the point of view of the
planners, variability might be considered desirable or undesirable.
Some programs, after all, encourage variation. Directors of such
programs have said to the staff or to their delegates at different
sites something like the following:

"The [...] Curriculum Office has chosen six reading programs which
we can purchase with our new Federal Compensatory Education
money. Examine these, and select the one you think best suits your
students and teachers."

It is likely that an evaluator, either formative or summative, will be
called in to examine the whole Federal Compensatory Education program,
and he will probably find six versions of the program taking place.
Here variation across sites has been planned, and implementation of
each reading subprogram will have to be described separately. When such
planned variation occurs, incidentally, the evaluator has a good
opportunity to collect information that might be useful for future
planning in this district or elsewhere, particularly if the district
wants to reduce the number of reading programs to fewer than six. He
can compare the ease and accuracy of implementation and success with
students of the various programs across sites. Where different programs
have been implemented by sites that are otherwise similar, the
evaluator can compare results to gain clues about the relative
effectiveness of the programs.
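The cross-site comparison described above can be sketched as a simple tabulation. This is an illustrative sketch only: the site names, the 0-to-1 implementation score, and the student gain figures are hypothetical, and the kit does not prescribe these particular measures.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical site records:
# (site, reading program, implementation score from 0 to 1, mean student gain)
site_records = [
    ("Site A", "Program 1", 0.90, 6.0),
    ("Site B", "Program 1", 0.70, 4.0),
    ("Site C", "Program 2", 0.60, 5.0),
    ("Site D", "Program 2", 0.80, 7.0),
]


def summarize_by_program(records):
    """Average the implementation and outcome measures for each program."""
    grouped = defaultdict(list)
    for _site, program, implementation, gain in records:
        grouped[program].append((implementation, gain))
    summary = {}
    for program, rows in grouped.items():
        summary[program] = {
            "sites": len(rows),
            "mean_implementation": round(mean(r[0] for r in rows), 2),
            "mean_gain": round(mean(r[1] for r in rows), 2),
        }
    return summary


summary = summarize_by_program(site_records)
for program, stats in sorted(summary.items()):
    print(program, stats)
```

With only a handful of sites per program, differences in such averages are clues to follow up, not evidence of relative effectiveness, and they are meaningful only where the sites are otherwise similar.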

Program directors could have allowed the program to vary in an even
less controlled way by saying:

"Here are X dollars to improve our teaching program for the [...]
disadvantaged. Take these funds and put together a new program."
This kind of directive produces a program whose only common features
across sites are likely to be the target students and the funding
source. While variation is also planned in this kind of situation,
unlike the program in the preceding example, each site has been left
free to create its own unique program. The district-wide evaluator will
have to look separately at each different version of the program that
emerges, probably using a theory-based, case study, or
responsive/naturalistic method. Though he may find a chance to make
comparisons among the program versions put into effect at each site, he
will probably spend a great deal of time discovering and reporting
about what each program variation looked like. However, the simple act
of telling the implementors about the various forms the program has
taken will be useful. Most probably, some forms of the program will be
more easily implemented, produce better results, or be more popular
than others.





A program can afford to permit considerable variation only in its
early stages, when it can make mistakes with minimal penalties. For
this reason, dealing with planned variation should be primarily the
concern of the formative evaluator, whose responsibility would then
entail tracking the variations, comparing results of different versions
of the program at comparable sites, and sharing information about
commendable practices. Unfortunately, funding agencies often require
summative reports at a time in the life of a program when considerable
variation still exists. When this happens, the summative evaluator
should note that several different program renditions are being
evaluated. He should describe each of these renditions separately,
making comparisons where possible.

If the evaluator, whether summative or formative, should uncover
variation across sites or over time that has not been planned, then he
will have to describe this, collecting backup data if he feels that he
will need corroborating evidence.

Question 4. When Do You Need Supporting Data?



You might need to describe program implementation for people who are
at some distance from the program, either in terms of location or
familiarity. These people will base their opinions about the program's
form and quality on what they read in your description. You might
therefore need to provide backup data to verify its accuracy.

If the description you produce is for people close to the program and
familiar with it, then you can rely on the audience's detailed
knowledge of the program in operation, at least in their own setting.
In such a case, you may want to focus your data collection on the
extent to which the program's implementation at one site is
representative of its implementation at other sites. The credibility of
your report for people close to the program will, of course, depend on
how well your description of the program matches what they see. If you
feel that your report of overall program implementation diverges
considerably from the experiences of the program's administration or of
participants at any one site, then you may need to collect good, hard
backup data.

Examples of more specific circumstances calling for backup data are:

• Summative evaluations which constitute research studies addressed
to the educational community at large

• Evaluations aimed at providing new information for a situation
where there is likely to be controversy

• Evaluations calling for program implementation descriptions so
detailed that they characterize program activity at the level of teacher
or student behaviors

• Descriptions of programs that may be used as a basis for adopting or
adapting the program in other settings

• Descriptions of programs which have varied considerably from site
to site or from time to time






How you use backup data will be determined in part by which of the
approaches to describing the program you adopt:

1. Using the program plan as a baseline and examining how well the
program as implemented fits the plan.

2. Using a theory or model to decide the features that should be present
in the program. In this case you will probably consult research literature
or prescriptions of various philosophical or psychological points of view
for guidance in what to look for. In both this and the plan-based
approaches, backup data will be necessary to permit people to judge
how closely the actual program fits what was planned. Such data could
also help you document your discovery of program features that were
not planned.

3. Following no particular prescription and instead taking a responsive/
naturalistic stance regarding the program. In this situation you will
attempt to enter the program sites with no initial preconceptions or
assumptions about what the program should look like.

If you assume either of the first two points of view concerning focus
and use of data, your final report will describe the fit of the program to
the prescription you have chosen to use. In the third situation, your final
report will simply describe the program that you found, noting, of course,
variability from site to site.

This chapter has discussed the measurement of program implementation
with a view toward making this aspect of your evaluation report
reflect the needs of your audiences, the context you are working in, and
your own professional standards. To help ensure that your reports will be
useful and credible, this chapter has been concerned with the critical
decisions you should make before you begin your evaluation: which
features of the program your evaluation should focus on and how you will
substantiate your description of the program. To help you with these
decisions, your attention has been directed to four key questions:

1. What purposes will your implementation study serve?
2. What are the program's most critical characteristics?
3. How much variation is there in the program?
4. When do you need supporting data?

Your implementation evaluation should be as methodologically sound as
you can make it. And, as when dealing with achievement and attitudes,
your report should provide credible and, above all, useful information to
your audiences.






Chapter 1, Page 2£ 
Endnotes (These will become footnotes in the published version) 

1. In general, describing program implementation is considered
synonymous with measuring attainment of process objectives or
determining achievement of means-goals, phrases used by other
authors. The book prefers, however, not to discuss implementation
solely in connection with process goals and objectives. This is
because the primary reason for measuring implementation in many
evaluations is to describe the program that is occurring, whether
or not this matches what was planned. Other times, of course,
measurement will be directed solely by pre-specified process
goals. Describing program implementation is a broad enough term
to cover both situations.

2. Audience is an important concept in evaluation. The audience is
the evaluator's boss; she is its information gatherer. Unless she
is writing a report that will not be read, every evaluator has at
least one audience. Many evaluations have several. An audience
is a person or group who needs the information from the evaluation
for a distinct purpose. Administrators who want to keep track of
program installation because they need to monitor the political
climate constitute one potential audience. Curriculum developers
who want data about how much achievement a particular program
component is producing comprise another. Every audience needs
different information and, importantly, each maintains different
criteria for what it will accept as believable information.

3. See, for instance, ??? UPDATED VERSION OF HOW TO DESIGN A PROGRAM
EVALUATION. See also the "Step-by-Step Guide for conducting a
small experiment" in ???, UPDATED VERSION OF EVALUATOR'S HANDBOOK.



4. An excellent presentation of the implications of various models of
schooling and education is put forth in Joyce, B. R., & Weil, M. Models
of teaching. Englewood Cliffs, NJ: Prentice-Hall, 1972. See as well
Kohl, H. The open classroom. New York: Random House, 1969, and also
Neill, A. S. Summerhill. New York: Hart, 1960.

5. Cooley, W. W., & Lohnes, P. R. Evaluation research in education.
New York: Irvington Publishers, 1976.

6. Reprinted from Cooley, W. W., & Lohnes, P. R. Evaluation research in
education. New York: Irvington Publishers, 1976, p. 191.

7. Leinhardt, G. Applying a classroom process model to instructional
evaluation. Curriculum Inquiry, 1978, 8(2).

8. For a more detailed discussion of how to conduct a qualitative
evaluation, see Patton, Michael Q. (Kit Book on Qualitative Methods)
and other references listed at the end of this chapter.

9. Where there is one formative evaluator working with the program
district-wide, she will become involved with assessing variation and
perhaps sharing ideas across sites. Where there is a separate formative
evaluator at each site, each evaluator will work according to different
priorities. The job of each evaluator will be to see that each version of
the program develops as well as possible, perhaps disregarding what other
sites are doing.




Figure 1. A model of classroom processes



Chapter 2 

QUESTIONS TO CONSIDER IN PLANNING AN IMPLEMENTATION EVALUATION 
ONE HUNDRED QUESTIONS FOR AN IMPLEMENTATION EVALUATION 

Chapter 1 presented an introduction to implementation
evaluation, including a rationale for conducting both formative
and summative evaluations and four questions to help you begin
planning such an evaluation. The purpose of this
chapter is to help you continue your initial planning by listing
many things you might want to know about a program you're
evaluating but might not think to ask. Such a claim to
inclusiveness is at least in part facetious; every program has
unique features that will generate questions of a highly
individual nature. But at the same time, the list of questions
that follows, generated from the experiences of many evaluators
with many programs, may help you to focus on aspects of the
program you might not otherwise have seen as important.

It may help you to think of this list as an outline for what
you may want to report to individuals who will use the results of
your evaluation. The headings and questions in this chapter are
organized according to what could eventually become the
major sections of a formal implementation report. But at this
point in your evaluation, you should not worry about what your
final product may look like. Research on evaluation use has
taught us that useful reports take a variety of forms, from
casual conversations over coffee, to working meetings, to formal
documents typed and bound.

At this stage in your planning, you need to think not about
report format, but rather about the specific information you need
to collect in order to answer the program's most important
questions. If you are conducting a formative evaluation for
immediate program improvement, jot down questions that would
enable you to quickly provide information for suggesting
strengths or for effecting changes. If, on the other hand, yours
will be a summative evaluation documenting a program's
implementation, target instead questions that will enable you to
create a meaningful record of what happened in or as a result of
the program.

In addition to choosing what to describe about the program,
you will need to decide which portions of your description must
be supported by corroborating evidence. The necessity to collect
supporting data to underlie your description of some program
features will, of course, be primarily a function of the setting
of your evaluation. But there are program features which,
because of their complexity, controversial nature, or critical
weight within programs, usually require backup data regardless of
the context. To remind you that your description of certain
features may meet skepticism, an asterisk appears in the outline
next to questions whose answers could require accompanying
evidence.

Outline of remainder of chapter: The list of questions will be
divided into three sections, as follows:

A. Program overview

1. Setting (what is the program's context?)

2. Program origins and history

3. Rationale, goals, objectives

4. Program staff and participants

5. Administration and budget

B. Program specifics (critical characteristics of the program)

1. Planned program characteristics

2. Questions for examining program materials

3. Questions for examining program activities

C. The evaluation itself

1. Purpose and focus

2. Range of measures and data collection

3. Timeframe

Summary section at the chapter end emphasizing that not all
questions fit every evaluation and that you won't always
write up every bit of information you get (i.e., use
these questions to help you frame a good evaluation,
focus the evaluation process on the issues where you can
or should make a difference).






Chapter 3 

HOW TO PLAN FOR MEASURING PROGRAM IMPLEMENTATION



Chapter 1 listed reasons for including an accurate program description in
your evaluation report. These reasons included the need to set down a
concrete record of the program that could be used for replication, to
provide a basis for making conjectures about relationships between
implementation and program effects, and to collect evidence of what the
program actually looked like. The chapter also described documenting
implementation for formative or summative purposes. It discussed the
choice of which program features to describe and a related decision:
whether your descriptions need substantiation, that is, which parts of
your report need to be backed up by data which you collect.

[...]






Methods of Data Collection

This book introduces you to three common approaches for
collecting backup data for your implementation report. The use
of any one method does not exclude use of the others. Your
selection of data collection methods depends on the extent to
which, given available resources, your report will provide
information that your audience considers accurate and credible.
The method or combination of methods you select will be primarily
a function of three factors: the overall purpose of your
evaluation; the information needs of your audience; and the
practical constraints surrounding the evaluation process.

Method 1. Examine the Records Kept Over the Course of
the Program. The first method of data collection requires that
you examine the program's existing records.

These might include sign-in sheets for materials, library loan records,
individual student assignment cards, and teachers' logs of activities in
the classroom. In a program where extensive records are kept as a matter
of course, you may be able to extract from them a substantial part of the
data you need to determine what activities occurred, what materials were
used, and how and with whom activities took place and materials were
used. This method will yield credible evaluation information because it
provides evidence of program events accumulated as they occurred rather
than reconstructed later. The major drawback of existing records is that
abstracting information from them can be time consuming. Moreover, records
kept over the course of the program will probably not meet all your
data collection requirements. If it looks as though the existing
records are inadequate, you have two alternatives. The best one is to set
up your own record-keeping system, assuming, of course, that you have
arrived on the scene in time to do this. A weaker alternative is to gather
recollected versions of program records from participants. Should you do
this, point out in your report the extent to which this information has
been corroborated by more formal records or results from other measures.

Method 2. Use Self-Report Measures. A second data
collection method involves having program personnel and
participants (teachers, aides, parents, administrators, and
students) provide descriptions of what program activities look
like.



In most cases you will of course have to turn for information
about a program to the people who worked with it. You might choose to
interview people or give them questionnaires. Usually, gathering
information from everyone who experienced the program will take too much
effort and time; ask then for descriptions of activities from a sample of
people within each role group.

Since different groups of participants in a program might have divergent
perceptions, you may want to gather information, probably on a sample
basis, from teachers, administrators, parents, and students alike.
Compare the information provided by the different groups. [...]

Be aware that self-report measures might have credibility problems.
Depending on the audience's relationship to the program, it might find
self-report information to be suspect. An audience at the funding agency,
for instance, is likely to be skeptical of self-report information from
the staff. First of all, there is the possibility that people providing
you with information have a vested interest in making the program look
good. However, even when intentional bias is unlikely, self-report
descriptions of a program are at best second-hand accounts of what
transpired: the evaluator tells the audience what people say they did.
Further, self-report information consists of recollections after the fact
of people's own behavior. Accounts of what people remember having
done themselves are usually not as credible as descriptions by others who
actually saw what they did.

Because of their credibility problems, and the detail with which program
implementation usually needs to be described, self-report instruments are
more often used to corroborate or check on the consistency across sites of
a program description arrived at by more direct means. Only when the
evaluator's resources are too limited to permit collection of close-up
data do self-report measures constitute the primary source of
implementation information.

Method 3. Conduct Observations. The final data
collection method discussed here is that of actual observations
of program activities, having one or more observers make periodic
visits to program sites to record their observations, either
freely or according to a predetermined list of questions.
Although it can require a great deal of time and effort, on-site
observation has high credibility because the observer watches or
even participates in program events as they occur. You can
enhance that credibility by demonstrating that the data from the
observations are reliable, i.e., consistent across different
observers and over time.
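
One way to demonstrate consistency across observers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two observers, the activity codes ("individual", "group"), and the eight coded intervals are hypothetical, and the two statistics shown (simple percent agreement and Cohen's kappa, which corrects agreement for chance) are common choices rather than anything this book prescribes.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of observation intervals that both observers coded identically."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both observers must code the same, non-empty set of intervals")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both observers used a single identical code throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two observers watching the same eight intervals.
observer_1 = ["individual", "individual", "group", "individual",
              "group", "individual", "group", "group"]
observer_2 = ["individual", "individual", "group", "group",
              "group", "individual", "group", "individual"]

print(percent_agreement(observer_1, observer_2))
print(cohens_kappa(observer_1, observer_2))
```

Reporting a chance-corrected figure alongside raw agreement guards against the case where two observers agree often simply because one code dominates the session.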




To help you in thinking about which methods of data
collection are appropriate for your own situation, Table 2, pages
?? and ??, summarizes the advantages and disadvantages of each of
the methods. The remainder of the book then devotes a chapter to
presenting each of the three data collection methods in greater
detail:

Chapter 4 discusses how to use records to assess program
implementation, describing both how to check program records
that already exist and how to set up a record-keeping
system.

Chapter 5 describes ways of using self-report instruments
with staff members, parents, students, etc., giving
step-by-step procedures for constructing and administering
questionnaires and conducting interviews.

Chapter 6 discusses program observations, describing methods
for conducting both informal and systematic observations.
The section on systematic observation presents several
alternative schemes for coding information, one of which
should fit your needs.



If it is important that you describe a program feature accurately, and
if your audience might be skeptical, then you should try to collect
converging data. This requires using multiple measures and data collection
methods and gathering data from different participants at different sites.
For example, if you were evaluating a program based on individualization,
you might want to document the extent to which instruction really is
determined according to individual need. To assure enough evidence, you
could collect different kinds of data. Maybe you would interview students
at the various program sites about the sequence and pacing of their lessons
and the extent to which instruction occurs in groups. To corroborate what
you find through student interviews, you could examine the teachers'
record-keeping systems. In an individualized program it is likely that
teachers would maintain charts or prescription forms tracking individual
student progress. Finally, you might conduct a few observations or spot
checks, watching typical classes in session to estimate the amount of
individual instruction and progress monitoring per student, both within
and across sites. Three sources of information (interviews, examination of
records, and classroom observation) could then be reported, each
supporting or qualifying the findings of another.
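
The idea of converging data can be made concrete with a small sketch. Everything specific here is an assumption made for illustration: the two sites, the figures, the 0.15 tolerance, and the choice to express each source as an estimated proportion of instruction time that is individualized. The point is only that each source yields its own estimate, and that a wide spread among them signals a finding that needs qualification.

```python
def compare_sources(estimates, tolerance=0.15):
    """Flag sites where independent data sources disagree.

    estimates maps a site name to a dict of {source: estimated proportion}.
    A site "converges" when the gap between its highest and lowest
    estimate is no larger than the tolerance.
    """
    report = {}
    for site, by_source in estimates.items():
        values = list(by_source.values())
        spread = max(values) - min(values)
        report[site] = {
            "spread": round(spread, 2),
            "converges": spread <= tolerance,
        }
    return report


# Hypothetical estimates of the share of instruction that is individualized.
estimates = {
    "Site A": {"interviews": 0.70, "records": 0.65, "observation": 0.60},
    "Site B": {"interviews": 0.80, "records": 0.40, "observation": 0.35},
}

report = compare_sources(estimates)
for site, result in report.items():
    print(site, result)
```

In this made-up case the sources agree at Site A, while at Site B the student interviews paint a rosier picture than the records or the spot checks, which is exactly the kind of discrepancy the report should describe rather than average away.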






Where To Look For Already Existing Measures

Before you involve yourself in the arduous business of designing your own
implementation measures, you might take a look at instruments already
available. Some measures, mainly observation schedules and
questionnaires, have been developed which can be used to describe general
characteristics of groups, classrooms, and other educational units. Titles
of these instruments often mention:

• School or classroom climate

• Patterns of interaction and verbal communication

• Characteristics of the environment

If you wish to explore some of these, check these anthologies:

Borich, G. D., & Madden, S. K. Evaluating classroom instruction: A
sourcebook of instruments. Menlo Park, CA: Addison-Wesley
Publishing, 1977.

This sourcebook contains a comprehensive review of instruments for
evaluating instruction and describing classroom activities. It lists 171
instruments and describes each along with its availability, reliability,
validity, norms, if any, and procedures for administration and scoring.
Each is also briefly reviewed, and sample items are provided. Only
measures which have been empirically validated appear in the sourcebook.
The instruments are cross-classified according to what the instrument
describes (teacher, pupil, or classroom) and who provides the information
(the teacher, the pupil, an observer).

Boyer, E. G., Simon, A., & Karafin, G. R. (Eds.). Measures of maturation.
Philadelphia: Research for Better Schools, Humanizing Learning
Program.

This is a three-volume anthology of 73 early childhood observation
systems. Most of these systems were developed for research purposes, but
some can be used for program evaluation.

The 73 systems are classified according to:

• The kinds of behavior that can be observed (individual actions and
social contacts of various types)

• The attributes of the physical environment

• The nature and uses of the data and the manner in which it is
collected

• The appropriate age range and other characteristics of those
observed

Each system is described in detail.

Simon, A., & Boyer, E. G. Mirrors for behavior: An anthology of classroom
observation instruments. Philadelphia: Research for Better
Schools, Center for the Study of Teaching, 1974.

This collection provides abstracts of 99 classroom observation systems.
Each abstract contains information on the subjects of the observation, the
setting, the methods of collecting the data, the type of behavior that is
recorded, and the ways in which the data can be used. In addition, an
extensive bibliography directs the reader to further information on these
systems and how they have been used by others.

An earlier edition of this work (1967) provides detailed descriptions of
twenty-six of these systems.

Price, J. L. Handbook of organizational measurement. New York: Heath,
1972.

This handbook lists and classifies measures which describe various
features of organizations. These are applicable, but not limited, to
schools and school districts. The instruments are classified according to
organizational characteristics, e.g., communication, complexity,
innovation. The text defines each characteristic and its measurement.
Then it describes and evaluates instruments relevant to the
characteristic, mentioning validity and reliability data, sources from
which the measure can be obtained, and references for additional reading.

Planning for Constructing Your Own Measure

Regardless of which methods you finally choose, your
information gathering should include four important
considerations, each of which should be thought through
(outlined? addressed?) before you begin data collection. These
planning bases are the following:

1. A list of the activities, materials, and administrative 
procedures on which you will focus 

2. Consideration of the validity and reliability of the measures 
you will use 

3. A sampling strategy, including a list of which sites you will 
examine, who will be contacted, interviewed, or observed, as 
well as when and how often 

4. A plan for data summary and analysis 

1. Constructing a list of program characteristics

Composing a list of critical characteristics is the first
step in each of the data gathering procedures outlined in
Chapters 4, 5, and 6. Constructing an accurate list early in
your evaluation will help ensure that program decision makers
receive credible information they will later be able to use.






A thoughtful look through the program's plan or proposal, a talk with
staff and planners, your own thinking about what the program should look
like (perhaps based on its underlying theory or philosophy), and careful
consideration of the implementation questions in Chapter 2 should help
you arrive at a list of the program materials, activities, or
administrative procedures whose implementation you want to track. Make
sure that the program features you list are detailed and exhaustive of
those considered by the staff, planners, and other audiences to be crucial
to the program. Detailed means the list should include a prescription of
the frequency or duration of activities and of their form (who, how,
where) that is specific enough to allow you to picture each activity in
your mind's eye.



If you are looking at a plan or proposal, then critical features will often
be those most frequently cited and those to which the largest part of the
budget and other resources have been allotted. For example, if large sets
of curriculum materials were purchased for the program, then one critical
part of the program implementation is the proper use of these materials.

If your work with the program will be formative, then you should
attend to parts of the program that are likely to need revision or cause
problems. Try to visit one or more sites in which the program is operating
and observe the environment, the materials, the people, and the activities
before you consider your list of program features complete. This way, you
will be able to envision the actual program situation when you construct
implementation instruments.

The program characteristics list can take any form that is useful to you.
If you think you might use it later in a summative report, or as a vehicle
for giving formative monitoring reports to staff, consider using a format
like the one in Table 3, page ??. This table can serve as a standard against
which to measure implementation. For summative evaluation, Table 3
could convey adequacy of implementation by adding two additional
columns at the right:





Assessment of adequacy          Backup data
of implementation







You might prefer to begin with a less elaborate materials/activities/
administrative features list than is shown in Table 3. The following
example presents a simpler one.






Example. The proposal for Emerson School's peer tutoring program
contained the following paragraph: "Tutoring activities will take
place three days a week in the third, fourth, and fifth grade classrooms
during the 45-minute reading period. Group 1 (fast) readers will each be
assigned one slower reader whose reading seatwork will become their
responsibility. All tutoring will be done using the exercises in the
'Read and Say' workbooks which were purchased for the program. During
tutoring, one teacher and one aide per classroom will circulate among
student pairs, answering questions and informally monitoring the
progress of tutees. Tutor-tutee rotation will take place every two
months."

The assistant principal, given the job of monitoring the program's
proper implementation, constructed for her own use a list of program
characteristics which included her own informal notes:



Peer-Tutoring Activities

From written plan:

* Frequency: 3 times a week

* Duration: 45-minute period

* Who: 3rd, 4th, 5th graders

* Where: classrooms

* Fast readers teach slower

* Must "have responsibility": what does this mean?
(Director says it just means they will tutor same
child all the time)

* All tutoring from "Read and Say": in order, or can
they skip around? (Third grade teacher says in
order)

* Teacher and aide travel from pair to pair

* They "monitor": is there a formal record-keeping
system? (Director says yes; recording sheets have
been drawn up and provided)

* Tutor-tutee rotate after two months

Additional data from interviews with the Reading
Specialist and Project Director, and Ms. Jones, third
grade teacher:

* 3rd, 4th, and 5th grade tutors in their own
classrooms; no switching rooms

* Teachers and aides: any difference in roles
vis-a-vis tutors?

* What did average readers do? Worked alone or in
pairs with other average readers, tutored when a
tutor was absent. Does this cause disruptiveness?





2. Consideration of the validity and reliability of the
measures you will use

Once you have constructed your list of program
characteristics, you should next think in a general way about the
type and content of instruments that would be appropriate for
collecting data on those characteristics that are of most
interest to your audience or those that have the potential for
controversy. One important consideration in your planning is the
technical adequacy of the implementation measures you will
choose: the validity and reliability of methods used to assess
program implementation. Even if you are not a statistical whiz,
you should make sure that the instruments you eventually use will
help you produce an accurate and complete description of the
program you are evaluating.

Assessments of the validity and reliability of a measurement instrument
help to determine the amount of faith people should place in its results.
Validity and reliability refer to different aspects of a measure's
credibility. Judgments of validity answer the question:

Is the instrument appropriate for what needs to be measured?

Judgments of reliability answer the question:

Does the instrument yield consistent results?

These are questions you must ask about any method you select to back up
your description of program implementation. "Valid" has the same root as
"valor" and "value"; it indicates how worthwhile a measure is likely to be
for telling you what you need to know. Validity boils down to whether
the instrument is giving you the true story, or at least something
approximating the truth.

When reliability is used to describe a measurement instrument, it carries
the same meaning as when it is used to describe friends. A reliable friend
is one on whom you can count to behave the same way time and again. In
this sense, an observation instrument, questionnaire, or interview
schedule that gives you essentially the same results when readministered
in the same setting is a reliable instrument.

But while reliability refers to consistency, consistency does not
guarantee truthfulness. A friend, for instance, who compliments your taste
in clothes each time she sees you is certainly reliable but may not
necessarily be telling the truth. Further, she may not even be
deliberately misleading you. Paying compliments may be a habit, or perhaps
her judgment of how you dress may be positively influenced by other good
qualities you possess. It may be that by a more objective standard you
and your friend have terrible taste in clothes! Similarly, simply because
an instrument is reliable does not mean that it is a good measure of what
it seems to measure.
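
The distinction between consistency and truthfulness can be illustrated numerically. The sketch below is a hedged example, not a procedure taken from this book: the figures are invented, and it assumes that five teachers twice reported how many days per week tutoring took place, while an independent observer recorded the actual number of days. The correlation between the two administrations speaks to reliability; the average gap between reports and observations speaks to validity.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def mean_bias(reported, observed):
    """Average amount by which reports exceed what was actually observed."""
    return sum(r - o for r, o in zip(reported, observed)) / len(reported)


# Hypothetical data: days of tutoring per week for five classrooms.
first_report = [5, 4, 5, 3, 4]    # questionnaire, first administration
second_report = [5, 4, 5, 3, 5]   # same questionnaire, readministered
observed_days = [3, 2, 3, 1, 2]   # independent observer's record

print(pearson_r(first_report, second_report))   # consistency of the reports
print(mean_bias(first_report, observed_days))   # systematic overstatement
```

Here the questionnaire gives much the same answers both times, yet every teacher's report runs two days above what the observer saw. Like the complimentary friend, the instrument is reliable without being a good measure of what it seems to measure.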




You are measuring, rather than simply describing the program on the basis
of what someone says it looks like, because you want to be able to back
up what you say. You are trying to assure both yourself and your audience
that the description is an accurate representation of the program as it
took place. You want your audience to accept your description as a
substitute for having an omniscient view of the program. Such acceptance
requires that you anticipate the potential arguments a skeptic might use
to dismiss your results. When measuring program implementation, the most
frequent argument made by someone skeptical of your description might go
something like this:

Respondents to an implementation questionnaire or subjects of observation
have an idea of what the program is supposed to look like, regardless of
whether this is what they usually do in fact. Because they do not wish to
appear to deviate or because they fear reprisals, they will bend their
responses or behavior to conform to a model of how they feel they ought to
appear. Where this happens, the instrument, of course, will not measure
the true implementation of the program. Such an instrument will be
invalid.

In measuring program implementation, concern over instrument validity
boils down to a four-part question: Is the description of the program
which the instrument presents accurate, relevant, representative, and
complete?

An accurate instrument allows the evaluation audience to create for
themselves a picture of a program that is close to what they would have
gained had they actually seen the program. A relevant implementation
measure calls attention to the most critical features of the program: those
which are most likely related to the program's outcomes and which
someone wishing to replicate the program would be most interested in
knowing about.

A representative description of program implementation will present a
typical depiction of the program and its sundry variations as they appeared
across sites and over time. A complete picture of the program is one that
includes all the relevant and important program features.

Making a case for accuracy and relevance 

You can defend the accuracy of your depiction of the program by ruling
out charges that there is purposeful bias or distortion in the information.

There are various ways to guard against such charges. Self-report
instruments, for example, can be anonymous. If you are using observations, you
can demonstrate that the observers have nothing to gain by a particular
outcome and that the events they have witnessed were not contrived for
their benefit. Records kept over the course of the program are particularly
easy to defend on this account if they are complete and have been checked
periodically against the program events they record. You need only show
that the people extracting the information from the records are unbiased.

You can, in addition, show that administration procedures are
standardized, that is, that the instrument has been used in the same way every
time. Make sure that

• Enough time was allowed to respondents, observers, or recorders so
that the use of the instrument was not rushed

• Pressure to respond in a particular way was absent from the
instrument's format and instructions, from the setting of its
administration, and from the personal manner of the administrator

Another way to argue that your description is accurate is to show that
results from any one of your instruments coincide logically with results
from other implementation measures.

You can also add support to a case that your instrument is accurate by
presenting evidence that it is reliable. Though it is usually difficult to
demonstrate statistically that an implementation instrument is reliable, a
good case for reliability can be based on the instrument's having several
items that examine each of the program's most critical features. Measuring
something important, say the amount of time students spend per day
reading silently, by means of one item only exposes your report to
potential error from response formulation and interpretation. You can
correct this by including several items whose results can be combined to
compile an index, or by administering the item several times to the same
person.
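
As a rough illustration of combining several items into an index, the
sketch below averages three items that all ask about daily silent reading
time. The classroom names, the item values, and the choice of a simple
average are assumptions made for the example; your own index may need a
different rule for combining items.

```python
from statistics import mean

# Hypothetical data: minutes of silent reading per day, as reported by
# three separate questionnaire items for each classroom.
responses = {
    "classroom_1": [20, 25, 15],
    "classroom_2": [10, 10, 13],
    "classroom_3": [30, 35, 40],
}


def compile_index(item_scores):
    """Combine several item results into one index by averaging them."""
    return mean(item_scores)


indexes = {room: compile_index(items) for room, items in responses.items()}

for room, value in indexes.items():
    print(f"{room}: index = {value:.1f} minutes")
```

Because each index rests on three answers rather than one, a single
misread or carelessly answered item has less influence on the figure you
report.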

If experts feel that a profile produced by an implementation instrument
hits major features of the program or program component you intend to
describe, then this is strong evidence that your data are relevant. For
instance, a classroom description would need to include the curriculum
used, the amount of time spent on instruction per unit per day, etc. A
district-wide program, on the other hand, might need to focus heavily on
key administrative arrangements for the program.

Making a case for representativeness and completeness 

To demonstrate representativeness and completeness, you must show that
in administering the instrument you did not omit any sites or time periods
in which program implementation may have looked different. You must
also show that you have not given too much emphasis to a single atypical
variation of the program. Thus your data must sample program sites
typical of each of the different places where the program has been
implemented. Your sample should also account for different times of the
day, or different times during the life of the program, if these are variations
likely to be of concern. The variations you have been able to detect must
represent the range of those that occurred.

As you can see, there is no one established method for determining
validity. Any combination of the types of evidence described here can be
used to support validity. If you plan to use an implementation instrument
more than once, consider the whole period of its use an opportunity to
collect information about the accuracy of the picture it gives you. Each
administration is a chance to collect the opinions of experts, to assess the
consistency of the view that this instrument gives you with that from
other instruments, etc. Establishing instrument validity should be a
continuing process.

Reliability refers to the extent to which measurement results are free of
unpredictable error. For example, if you were to give students a math test
one day and, without additional instruction, give them the same test two
days later, you would expect each student to receive more or less the same
score. If this should turn out not to be the case, you would have
to conclude that your instrument is unreliable, because, without
instruction, a person's knowledge of math does not fluctuate much from day to
day. If the score fluctuates, the problem must be with the test. Its results
must be influenced by things other than math knowledge. These other
things are called error.
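
The math test example can be sketched in a few lines. The scores below
are invented, and the Pearson coefficient is only one of several
correlation coefficients you might use; treat this as an illustration of
the idea rather than a prescribed procedure.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical math scores for the same five students, two days apart.
first_administration = [72, 85, 60, 90, 78]
second_administration = [70, 88, 62, 91, 75]

r = pearson_r(first_administration, second_administration)
print(f"test-retest correlation: {r:.2f}")
```

A coefficient close to +1 means students kept roughly the same standing
from one administration to the next; a much lower value would suggest
that something other than math knowledge is influencing the scores.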

Sources of error that affect the reliability of tests, questionnaires,
interviews, etc., include

• Fluctuations in the mood or alertness of respondents because of
illness, fatigue, recent good or bad experiences, or other temporary
differences among members of the group being measured.

• Variations in the conditions of use from one administration to the
next. These range from various distractions, such as unusual outside
noises, to inconsistencies and oversights in giving directions.

• Differences in scoring or interpreting results, chance differences in
what an observer notices, and errors in computing scores.

• Random effects caused by examinees or respondents who guess or
check off alternatives without trying to understand them.






Methods for demonstrating an instrument's reliability, whether the
instrument is long and intricate or composed of a single question, usually
involve comparing the results of one administration of the instrument with
another by correlating them (see footnote 24).

The evaluator designing and using instruments for measuring program
implementation has unique problems when attempting to demonstrate
reliability. Most of these problems stem from the fact that implementation
instruments aim at characterizing a situation rather than measuring some
quality of a person. While a person's skill, say in basic math, can be
expected to stay constant long enough for assessment of test reliability to
take place, a program cannot be expected to hold still so that it can be
measured. Because the program will likely be dynamic rather than static,
possibilities for test-retest and alternate-form reliability are usually ruled
out. And since most instruments used for measuring implementation are
actually collections of single items which independently measure different
things, the possibility of computing split-half reliabilities practically never
occurs.



Few program evaluators have the luxury of sufficient time to
design and validate data collection measures. But early
attention to the validity and, to a lesser extent, the
reliability of measures will help ensure that the information
gathered during the evaluation will enable the evaluator to
answer well the questions that potential users most care about.
An implementation evaluation can be a waste of time if it
collects data that are technically "good," but that don't answer
the right questions. Perhaps worse is the evaluation that relies
on data that are weak at best. When decision-makers use bad data
to guide program decisions, evaluation has done a disservice.



24. Correlation refers to the strength of the relationship between two measures.
A high positive correlation means that people scoring high on one measure also score
high on the other. A low correlation means that knowing a person's score on one
measure does not educate your guess about his score on the other. Correlations are
usually expressed by a correlation coefficient, a decimal between -1 and +1,
calculated from people's scores on the two measures. Since there are several different
correlation coefficients, each depending on the types of instruments being used,
discussion of how to perform correlations to determine validity or reliability is
outside the scope of this book. The various correlation coefficients are discussed in
most statistics texts, however. You might also refer to How to Calculate Statistics,
part of the Program Evaluation Kit.







3. Creating a Sampling Strategy 

Unless the program you are examining is small and simple, you will not
be able to collect and transcribe data on everything that occurs over the
course of the entire program. What is more, there is no need to cover the
entire spectrum of sites, participants, events, and activities in order to
produce a complete and credible evaluation. But you will need to decide
early where the implementation information you do collect will come
from. Specifically you must plan:

• Where to look

• Whom to ask or observe

• When to look, and how to sample events and times

Where to look

The first decision concerns how many program sites you should examine.
Your answer to this will be largely determined by your choice of
measurement method; a questionnaire, for instance, can reach many more places
than can an observer. Unless the program is taking place in just a few
places, close together, it will probably not be practical or necessary to
examine implementation at all of them. A representative sample will
provide you with sufficient information to be able to develop an accurate
portrayal of the program.

Solving the problem of which sites constitute a representative sample
requires that you first group them according to two sets of characteristics:

1. Features of the sites that could affect how the program is
implemented, such as size of the population served, geographical location,
number of years participating in the program, amount of community or
administrative support for the program, level of funding, teacher
commitment to the program, student or staff traits or abilities.

2. Variations permitted in the program itself that might make it look
different at different locations, such as amount of time given to the
program per day or week, choice of curricular materials, or omission of
some program components such as a management system or audiovisual
materials (see footnote 15).

The list of such features is long and unique to each evaluation. For your
use, choose four or so likely sources of major program divergence
across sites and classify the sites accordingly. Then, based on how many
sites you think you can examine, try to randomly choose some to
represent each classification. You can, of course, select some sites for
intensive, perhaps even case, study and a pool of others to examine more
cursorily.

You may also, for public relations reasons, need to at least make 
an appearance at every program site. In any case, make certain 
that you will be allowed access to every site you will need to 
visit. Such access should be assured before you begin collecting 
data. 
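
The classify-then-choose procedure described above can be sketched as
follows. The site names, the two classifying features ("size" and
"funding"), and the helper name choose_sites are all hypothetical; in
practice you would substitute the four or so sources of divergence that
matter for your program.

```python
import random
from collections import defaultdict

# Hypothetical sites, each classified on two likely sources of divergence.
sites = [
    {"name": "Site A", "size": "large", "funding": "high"},
    {"name": "Site B", "size": "large", "funding": "high"},
    {"name": "Site C", "size": "large", "funding": "low"},
    {"name": "Site D", "size": "small", "funding": "high"},
    {"name": "Site E", "size": "small", "funding": "low"},
    {"name": "Site F", "size": "small", "funding": "low"},
]


def choose_sites(sites, keys, per_class, seed=None):
    """Group sites by the given features, then randomly pick up to
    `per_class` sites from each classification."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for site in sites:
        groups[tuple(site[k] for k in keys)].append(site)
    return {
        label: rng.sample(members, min(per_class, len(members)))
        for label, members in groups.items()
    }


sample = choose_sites(sites, keys=("size", "funding"), per_class=1, seed=1)
for label, chosen in sample.items():
    print(label, [s["name"] for s in chosen])
```

Fixing the seed simply makes the draw repeatable, which can help if you
are later asked to show how the sites were chosen.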



15. Where possible, including a few comparable sites which have not installed the
program at all will give you a basis for interpreting some of the data you collect. This
will help you determine, for instance, whether the absentee rate in the program is
unusual or how much added effort is required from instructors. You can gather
comparison data by monitoring or asking about usual practice at the program
sites before the program was introduced.






Whom to ask or observe 

Regardless of the size of the program or how many sites your
implementation evaluation reaches, you will eventually have to talk with,
question, or observe people. In most cases, these will be people both within
the program (the participants whose behavior it directs) and those outside
(parents, administrators, contributors to its context). Answers to questions
about whether to sample people depend, as with your choice of sites, on
the measurement method you will use and your time and resources.

Whom you approach for information also depends on the willingness of
people to cooperate, since implementation evaluation nearly always
intrudes on the program or consumes some staff time. If you plan to use
questionnaires, short interviews, or observations that are either infrequent
or of short duration, then you probably can select people randomly. In
these cases, applying the clout factor by having a person in authority
introduce you and explain your purpose will facilitate cooperation.

If you intend to administer questionnaires or interviews for other
purposes, perhaps to measure people's attitudes, you may be able to insert
a few implementation questions into these. It is often possible, and good
practice, to consolidate instruments.

At times your measurement will require a good deal of cooperation.
This is the case with requests for record-keeping systems that require
continuous maintenance; intensive observation, either systematic or
responsive/naturalistic; and questionnaires and interviews given periodically
over time to the same people. If data collection requires considerable
effort from the staff, and you have too little authority to back your
requests, then you should probably ask for voluntary participants. Possible
bias from volunteerism can be checked through short questionnaires to a
random sample of other staff members. The advantage of gathering
information from people willing to cooperate is that you will be able to report
a complete picture of the program.

Exactly which people should you question or observe? Answers to this
will vary, but here are some pointers:

• Ask people, of course, who are likely to know: key staff members
and planners. If you think that these people might give you a
distorted view, your audience will likely think so too. Thus you
should back up what official spokespersons tell you by observing or
asking others.

• Some of the others should be students if possible. Good information
also comes from support staff members, assistants, aides, tutors,
student teachers, secretaries, parents. People in these roles see at
least part of the program in operation every day but they are less
likely to know what it is supposed to look like officially.



• Ask people to nominate the individuals who are in the
best position to tell you the "truth" about the program. 
When the same names are mentioned by several program 
people, you know that you should carefully consider the 
information they provide. 






If you intend to observe or talk to people several different times over
the course of the program, then choice of respondents will be partially
dependent on your time frame. Choosing which times and events to
measure is discussed in the next section.

When to look

Time will be important to your sampling plan if your answer to any of
these questions is yes:

• Does the program have phases or units that your implementation
study needs to describe separately?

• Do you wish to look at the program periodically in order to monitor
whether program implementation is on schedule?

• Do you intend to collect data from any individual site more than
once?

• Do you have reason to believe that the program will change over the
course of the evaluation?

• If so, do you want to write a profile of the program throughout its
whole history that describes how it evolved or changed?

In these situations, you will probably have to sample data collection dates.
First, divide the time span of the program into crucial segments, such as
beginning, middle, and end; first week, eighth week, thirteenth week; or
Work Units 1, 3, and 6. Then decide if you will request information from
the same sample of people at each time period or whether you will set up
a different sample each time.

If and when you sample, be sure to return to the pool the sites or staff
members selected to provide data during one particular time segment so
that they might be chosen again during a subsequent time segment. People
(or sites) should not be eliminated from the pool because they have
already provided data. Only when you sample from the entire group can
you claim that your information is representative of the entire group.
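
The point about returning people to the pool can be illustrated with a
short sketch. The staff names, the three segments, and the function name
sample_each_segment are assumptions for the example; the essential
feature is that every segment's draw is made from the whole pool.

```python
import random

# Hypothetical pool of staff members and time segments.
staff_pool = ["Teacher 1", "Teacher 2", "Teacher 3", "Teacher 4", "Teacher 5"]
segments = ["beginning", "middle", "end"]


def sample_each_segment(pool, segments, k, seed=None):
    """Draw k people for every segment from the FULL pool, so that someone
    chosen in one segment can be chosen again in a later one."""
    rng = random.Random(seed)
    return {segment: rng.sample(pool, k) for segment in segments}


plan = sample_each_segment(staff_pool, segments, k=2, seed=7)
for segment, people in plan.items():
    print(segment, people)
```

Note that the pool itself is never shortened between segments; removing
people who have already provided data would mean that later samples
represent only part of the group.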

Timing of data collection needs additional adjustment for each
measurement method. Questionnaires and interviews that ask about typical
practice can be administered at any time during the period sampled. Some
instruments, though, will make it necessary to carefully select or sample
particular occasions. You want your observations, for instance, to record
typical program events transpiring over the course of a typical program
day. The records you collect should not come from a period when atypical
events, such as a bus strike or flu epidemic, are affecting the program or
its participants. Sampling of specific occasions (days, weeks, or possibly
hours) will be necessary, as well, if you plan to distribute self-report
instruments which ask respondents to report about what they did "today" or
at a specific time.




Figure 2 demonstrates how selection of sites, people, and times can be
combined to produce a sampling plan for data collection. In Figure 2, a
district office evaluator has selected sites, people (roles), and times in order
to observe a reading program in session. The sampling method is useful
because, in essence, the evaluator wants to "pull" representative events
randomly from the ongoing life of the program. Her strategy is to
construct an implementation description from short visits to each of the four
schools taking part in the program.

Figure 2 is an example of an extensive sampling strategy; the evaluator
chose to look a little at a lot of places. Sampling can be intensive as
well: it can look a lot at a few places or people. In such a situation, data
from a few sites, classrooms, or students can be assumed to mirror that of
the whole group. If the set of sites or students is relatively homogeneous,
that is, alike in most characteristics that will affect how the program is
implemented, you can randomly select representatives and collect as much
data as possible from them exclusively. If the program will reach
heterogeneous sites, classrooms, groups of students, etc., then you should select a
representative sample from each category addressed by the program, for
instance, schools in middle-class versus schools in poorer areas, or fifth
grades with delinquency-prone versus fifth grades with average students.
Then examine data from each of these representatives. The strategy of
looking intensively at a few places or people is almost always a good idea
whether or not you use extensive sampling as well. These intensive studies
could almost be called case studies, except that most case study
methodologists disavow the need to ensure representativeness.

Planning Data Summary and Analysis

This section is intended to help you consolidate the data you
collect regardless of your evaluation's purpose or intended
outcomes.

There are two possible purposes for an implementation study. The first
and major one is, of course, to describe the program and perhaps comment
about how well it matches what was intended. A second purpose is to
examine relationships between program characteristics and outcomes or
among different aspects of the program's implementation. Examining
relationships means exploring, usually statistically, the hypotheses on
which the program is based. Do smaller classes achieve more? Are periodic
planning meetings related to staff morale?

It may seem odd to be concerned about how you will summarize the
data at a point where you have barely decided what questions to ask. But
it is time-consuming to extract information from a pile of implementation
instruments and record, examine, summarize, and interpret it. Thinking
about the data summary sheet in advance will encourage you to eliminate
unnecessary questions and make sure you are seeking answers at the
appropriate level of detail for your needs.

To handle data efficiently, you should prepare a data summary sheet
for each measurement instrument you use, if possible at the time you
design the instrument. Data summary sheets will help you interpret the
backup data you have collected and support your narrative presentation
because they assist you in searching for patterns of responses that allow
you to characterize the program. They also assist you in doing calculations
with your data, should you need to do so.




The following section has four parts:

• A description of the use of data summary sheets for collecting
together item-by-item results from questionnaires, interviews, or
observation sheets.

• Directions for reducing a large number of narrative documents, such
as diaries, or responses to open-ended questionnaires or interviews,
into a shorter but representative narrative form.

• Directions for categorizing a large number of narrative documents so
that they can be summarized in quantitative form.

• Suggestions for analyzing and reporting quantitative implementation
data.



Preparing a data summary sheet for scoring by hand or by computer

A data summary sheet requires that you have either closed-response
data or data that have been categorized and coded. Closed-response data
include item results from structured observation instruments, interviews,
or questionnaires. These instruments produce tallies or numbers. If, on the
other hand, you have item results that are narrative in form, as from
open-ended questions on a questionnaire, interview, or naturalistic
observation report, then you will first have to categorize and code these
responses if you wish to use a data summary sheet. Suggestions for coding
open-response data appear on page 73.

The first part of the following discussion on the use of summary sheets
deals with recording and analyzing by hand; the latter part deals with
summary sheets for machine scoring and computer analysis.



When scoring by hand, you can choose between two ways of summarizing
the data: the quick-tally sheet and the people-item roster.

A quick-tally sheet displays all response options for each item so that
the number of times each option was chosen can be tallied, as in the
samples on page 68.

The quick-tally sheet allows you to calculate two descriptive statistics
for each group whose answers are tallied: (1) the number or percent of
people who answered each item a certain way, and (2) the average
response to each item (with standard deviation) in cases where an average
is an appropriate summary. Notice that with a quick-tally sheet, you
lose the individual person. That is, you no longer have access to
individual response patterns. That is perfectly acceptable if all you want to
know is how many (or what percentage of the total group) responded in a
particular way.
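
If the tallying is done by computer rather than by hand, a quick-tally
summary for one item might look like the sketch below. The answers and
the five-point scale are invented, and the population standard deviation
is used here only as one reasonable choice.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical answers to one item on a 1-5 scale, one entry per respondent.
answers = [4, 5, 3, 4, 4, 2, 5, 4]
options = [1, 2, 3, 4, 5]

tally = Counter(answers)  # how often each option was chosen
counts = {opt: tally.get(opt, 0) for opt in options}
percents = {opt: 100 * counts[opt] / len(answers) for opt in options}

average = mean(answers)
spread = pstdev(answers)  # population SD; statistics.stdev gives a sample SD

for opt in options:
    print(f"option {opt}: {counts[opt]} ({percents[opt]:.1f}%)")
print(f"mean = {average:.2f}, SD = {spread:.2f}")
```

As with the paper version, the tally records only how many respondents
chose each option; it does not keep track of who chose what.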



Often, for data summary reasons or to calculate correlations, you will
need to know about the response patterns of individuals within the group.
In these cases, a people-item data roster will preserve that information. On
a people-item data roster the items are listed across the top of the page.
The people (or classrooms, program sites, etc.) are listed in a vertical
column on the left. They are usually identified by number. Graph paper,
or the kind of paper used for computer programming, is useful for
constructing these data rosters, even when the data are to be processed by
hand rather than by computer. The people-item data roster below shows
the results recorded from the filled-in classroom observation response form
that precedes it.
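
A people-item roster kept on a computer could be organized along the
lines of the sketch below: one row per person, one column per item. The
item names, person numbers, and scores are hypothetical, and the helper
named column is introduced only for this illustration.

```python
# Hypothetical roster: one row per person (identified by number),
# one column per item, in the order given by `items`.
items = ["item_1", "item_2", "item_3"]
roster = {
    1: [3, 4, 2],
    2: [5, 5, 4],
    3: [2, 3, 1],
}


def column(roster, items, item):
    """Return every person's response to one item, in person-ID order."""
    index = items.index(item)
    return [roster[person][index] for person in sorted(roster)]


# Print the roster with items across the top and people down the left.
print("person  " + "  ".join(items))
for person in sorted(roster):
    print(f"{person:<8}" + "  ".join(f"{score:>6}" for score in roster[person]))

# An individual response pattern is still available ...
print("person 2:", roster[2])
# ... and so is the column needed for an item-level summary or correlation.
print("item_2:", column(roster, items, "item_2"))
```

Because each row stays attached to a person number, the same roster can
feed both item-by-item tallies and correlations between items.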







INSERT XEROXED PORTIONS OF PAGES 70-71 






How to Summarize a Large Number of Written Reports Into
Shorter Narrative Form

If you have to summarize answers to open response
questionnaire items, diary or journal entries, unstructured
interviews, or narrative reports of any sort, you will want a
systematic way to do this. The following list, adapted to your
own needs, should help you to design your own system for
analysis.

1. Begin by writing an identification number on each separate 
data source (e.g., each questionnaire, each journal). If 
used properly, these numbers will always enable you to return 
to the original if need be to check its exact wording. 

2. If you have sufficient time, read quickly through the
materials you are trying to summarize, looking for major
themes, categories, and issues, as well as critical incidents
and particularly expressive quotations. Mark all of these in
pencil.

3. If you will analyze the data by hand, obtain several sheets
of plain paper to use as tally sheets. Divide each paper
into about four cells by drawing lines. If you have access
to a microcomputer and can type well enough, consider using
it instead of paper so you will avoid having to copy data
by hand.






4. Select one of the reports, and look for the kinds of events
or situations it describes or, if you completed step 2, for
evidence of any major categories you have already determined.
As soon as an event is described, write a short summary of it
in a cell on one of the tally sheets. You may also wish to
copy an exact quotation if it is particularly well worded.
Be sure to include the ID number of the report in parentheses
following the summary so that if you should need to return to
the original you will not have to go through all of the
reports to find it. Then, in one corner of the cell, tally a
"1" to indicate that that statement has been made in one
report.

As you read the rest of the report, every time you come upon a
previously unmentioned event, summarize it in a cell and give
it a single tally for having appeared in one report. When
you have read through the entire report, put a checkmark or
other mark on it to indicate that you have finished with it.
If you are summarizing open-ended questionnaire results and
have 30 or fewer respondents, you might want to copy the
responses to each item in order to put in one place the
specific answers to a given question.



5. Read the rest of the reports in any order. Record new statements as
above. When you come upon one that seems to have been mentioned in
a previous report, find the cell that summarizes it. Read carefully,
making sure that it is more or less the same kind of event. Record
another "1" in the cell to show that it has been mentioned in another
report. If some part of an event or opinion differs substantially from or
adds a significant element to the first, write a statement that covers this
different aspect in another cell so that you may tally the number of
reports in which this new element appears.

6. Prepare summaries of the most frequent statements for inclusion in
your report. There may be good reasons for recording separately data
from different groups if the reporters faced circumstances that would
predictably bring about different results (e.g., different grade levels,
different program variations). Also, if the quantity of the data that you
are gleaning from the reports appears to be unwieldy, you may find it
necessary to organize the events mentioned into different categories, in
some cases more general, in others more narrow. Whenever new
summary categories are formed, however, you are cautioned to avoid the
blunder of trying to transfer previous tallies from the original
categories. The only safe procedure is to return to the original source, the
reports themselves, and then tally results for the new categories.
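
For readers working on a computer instead of paper tally sheets, the
steps above might be mirrored as in the following sketch. The report ID
numbers and summary statements are invented, and the function name
tally_statements is hypothetical; each "cell" is represented by a
summary statement together with the IDs of the reports that mention it.

```python
from collections import defaultdict

# Hypothetical coded reports: report ID -> short summaries of the events
# the reader found in that report.
coded_reports = {
    101: ["aides lead small groups", "materials arrived late"],
    102: ["materials arrived late"],
    103: ["aides lead small groups", "weekly planning meeting held"],
}


def tally_statements(coded_reports):
    """Map each summary statement to the set of report IDs that mention it."""
    cells = defaultdict(set)
    for report_id, statements in coded_reports.items():
        for statement in statements:
            cells[statement].add(report_id)  # one tally per report
    return cells


cells = tally_statements(coded_reports)
for statement, ids in sorted(cells.items(), key=lambda kv: -len(kv[1])):
    print(f"{len(ids)} report(s): {statement} {sorted(ids)}")
```

Keeping the report IDs with each statement, rather than a bare count,
makes it easy to go back to the original reports, which is also what you
must do if you later decide to form new categories.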






How to summarize a large number of written reports by categorizing

The following procedure helps you to assign numerical values to different
types of responses and use these data in further statistical analyses. Suppose,
for example, you asked 100 teachers to describe their experiences at a
Teacher Learning Center where they received in-service training in
classroom management techniques. After reading their reports and
summarizing them for reporting in paragraph form, you wonder how closely the
practices of the Teacher Center conform to the official description of the
instruction it offers. You can find this out by categorizing teachers'
reports into, say, five degrees of closeness to official Teacher Center
descriptions (very close, through so-so, to downright contradictory), giving
each teacher an opinion score, 1 through 5. Such rank-order data will give
you a quantitative summary of teachers' experiences of the program.
Perhaps you could then correlate this with their liking for the program or
their achievement in courses.
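
One way to correlate rank-order scores of this kind is a rank
correlation. The sketch below uses the Spearman coefficient, which is an
assumption on our part (the text does not name a particular
coefficient), and the eight teachers' scores are invented for
illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, counted from 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical data for eight teachers: closeness score (1-5) assigned by
# the evaluator, and each teacher's own liking rating (1-5).
closeness = [5, 4, 4, 3, 2, 2, 1, 5]
liking = [5, 5, 3, 3, 2, 1, 1, 4]

rho = spearman(closeness, liking)
print(f"rank correlation: {rho:.2f}")
```

Tied scores are common when only five categories are used, which is why
the sketch assigns tied values the average of the ranks they share.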

The difficulty of the task of categorizing open-response data will vary
from one situation to another. Precise instructions for arriving at your
categories and summarizing your data cannot be provided, but the
following advice should help make the task more manageable.

1. Think of a dimension along which program implementation might
vary: closeness of fit to the program plan, perhaps, or approximation to
a theory or effectiveness of instruction. The dimension you choose
should characterize the kinds of reports given to you so that you can
put them in order from desirable to undesirable.

2. Read what you consider to be a representative sampling of the data,
about 25%. Determine if it is possible to begin with three general
categories: (a) clearly desirable, (b) clearly undesirable, and (c) those in
between.

3. If the data can be divided into these three piles, you can then put aside
for the moment those in categories (a) and (b) and proceed to refine
category (c) by dividing it into three piles:

• Those that are more desirable than undesirable

• Those that are more undesirable than desirable

• Those in between

4. Refine categories (a) and (b) as you did (c). If you cannot divide them
into three gradations along the dimension you have chosen, then use
two; or if the initial breakdown seems as far as you can go, leave it as is.

5. Have one or more people check your categories. This can be done by
asking others to go through a similar categorization process or to
critique the categories and the selections you have made.
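
If a second reader goes through the same categorization process, a
simple check is to see how often the two of you placed a report in the
same category. The sketch below computes percent agreement; the codes
are invented, and percent agreement is offered only as one simple
measure, not as the required one.

```python
# Hypothetical category codes (1-5) given to the same ten reports by the
# evaluator and by a second reader working independently.
evaluator = [1, 2, 2, 3, 5, 4, 3, 1, 2, 5]
second_reader = [1, 2, 3, 3, 5, 4, 2, 1, 2, 4]


def percent_agreement(codes_a, codes_b):
    """Percentage of reports placed in the same category by both readers."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both readers must code the same reports")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)


def disagreements(codes_a, codes_b):
    """Positions (counted from 0) of reports the readers coded differently."""
    return [i for i, (a, b) in enumerate(zip(codes_a, codes_b)) if a != b]


print(f"agreement: {percent_agreement(evaluator, second_reader):.0f}%")
print("reports to discuss:", disagreements(evaluator, second_reader))
```

The list of disagreements is often more useful than the percentage
itself, since it points to the reports, and possibly the category
definitions, that need another look.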






Some suggestions for analyzing and reporting quantitative implementation 
data 

Computing results characteristic by characteristic. If you want to report
quantitative information from your implementation instruments, this
section is designed to help you. It assumes that you have first transferred data
to a summary sheet.

Your implementation data depict the frequency, duration, or form of
critical characteristics of the program. If you want to explore relationships
between certain program characteristics and others, or between program
features and achievement or attitude outcomes of the program, then you
want to make statements of the nature of "Programs which had
characteristic K tended to J." Here K is a description of the frequency or form of
a particular program feature, and J is an achievement, the attitude in a
particular group, or perhaps the frequency or form of yet another program
feature. You might, for instance, want to see whether programs with more
than two aides in the classroom show higher staff morale or perhaps
whether experience-based high school vocational programs with a wide
choice of work-study plans have fewer dropouts.

Showing this relationship can be done in two ways:

• You can use instrument results (K) to classify programs and then
calculate the average J per program, or

• You can correlate K with J
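
The two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the site data, the field names (`aides`, `morale`), and the cut-off of two aides are hypothetical, and the correlation shown is the ordinary Pearson coefficient, which the text does not name.

```python
from math import sqrt
from statistics import mean

# Hypothetical data: one row per program site.
# K = number of aides in the classroom, J = staff morale score (scale assumed).
sites = [
    {"site": "A", "aides": 1, "morale": 62},
    {"site": "B", "aides": 3, "morale": 80},
    {"site": "C", "aides": 2, "morale": 70},
    {"site": "D", "aides": 4, "morale": 85},
    {"site": "E", "aides": 0, "morale": 55},
    {"site": "F", "aides": 3, "morale": 78},
]


def average_j_by_class(rows, threshold=2):
    """Way 1: classify sites by K (more than `threshold` aides or not), then average J."""
    high = [r["morale"] for r in rows if r["aides"] > threshold]
    low = [r["morale"] for r in rows if r["aides"] <= threshold]
    return {"more_than_two_aides": mean(high), "two_or_fewer_aides": mean(low)}


def pearson_r(xs, ys):
    """Way 2: Pearson correlation between K and J."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


group_means = average_j_by_class(sites)
r = pearson_r([s["aides"] for s in sites], [s["morale"] for s in sites])
print(group_means)
print(round(r, 2))
```

Classifying first is easier to explain to a lay audience; correlating keeps all of the information in K instead of collapsing it into two groups.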

Before you bother to compute a statistic, you should be clear about the
question you are trying to answer and consider who would be interested
in the answer and what impact it might have.

If you decide to explore relationships of this sort, you have two choices
about what to use for K (and J, if it is another program feature):

1. K can be a summary of responses to a single item. It could be, for
instance, a classification of schools by funding level of the program, or
the average number of participating classrooms at a site. It could be the
number of parent volunteers, the number of years the program has been
in operation, or observers' estimate of the average amount of time spent
at a particular activity. If you use a single item to determine the
classification, then make sure that the item gives valid and reliable
information. The probability of making an error when answering one
item is usually so large that people might be skeptical. If you must use a
single item to indicate K, then make sure you can verify what the item
tells you. If the classification according to program characteristics
which gives you K is critical to the evaluation, you should probably use
multiple measures or an index to estimate K.
2. You can calculate an index to represent K by combining the results of
several items or several different implementation measures. A procedure
that asks about slightly different aspects of the same characteristic
several times, and then combines the results of these questions to
indicate the presence, absence, or form of the characteristic, is less
likely to be affected by the random error that plagues single questions.
An index, therefore, is a more reliable estimate of K than the results of
a single item.[18]
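
As a minimal sketch of the index idea, the snippet below averages three items that ask about the same characteristic. The classroom names, the 1-to-5 response scale, and the scores are all hypothetical; averaging is one of the two simple combinations (adding or averaging) mentioned in the accompanying note.

```python
from statistics import mean

# Hypothetical responses: three questionnaire items that each ask, in slightly
# different words, how often students used self-testing sheets
# (1 = never ... 5 = daily; the scale is an assumption).
responses = {
    "classroom_1": [4, 5, 4],
    "classroom_2": [2, 1, 2],
    "classroom_3": [3, 4, 3],
}


def index_from_items(item_scores):
    """Combine several item scores into one index by averaging them."""
    return mean(item_scores)


k_index = {room: index_from_items(scores) for room, scores in responses.items()}
for room, value in k_index.items():
    print(room, round(value, 2))
```

A single odd answer moves the average far less than it would move a one-item measure, which is the sense in which the index is the more reliable estimate.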






18. A quick way to compute an index is to add or average the results from several
items or instruments. To produce a more credible and, therefore, useful instrument,
it is a good idea to item analyze the different questions or instrument results which
contribute to the index. The method for doing this is similar to that for constructing
an attitude rating scale. Directions for computing indices and developing attitude
rating scales can be found in Henerson, M. E., Morris, L. L., & Fitz-Gibbon, C. T.
How to measure attitudes. In L. L. Morris (Ed.), Program evaluation kit. Beverly
Hills: Sage Publications, 1978.






If a program plan, or perhaps a theory, has guided your examination of
the program, then a particularly useful index for summarizing your
findings at each site might be an estimate of degree of implementation. How
you calculate such an index will vary with the setting. You would,
however, select a set of the program's few most critical characteristics, and
then compute the index from judgments of how closely the program
depicted by the data from one or more instruments has put these into
operation. The simplest index of degree of implementation would result
from a checklist on which observers note presence or absence of important
program components. The index would equal the number of "present" boxes
checked.
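
The checklist index described above amounts to a count. A small sketch follows; the component names are invented for illustration, and the proportion is an optional extra that is not part of the text's definition.

```python
# Hypothetical checklist for one site: True means the observer noted the
# component as present. Component names are illustrative only.
checklist = {
    "individual math assignments in use": True,
    "self-testing sheets available": True,
    "aide assists on request": False,
    "progress charts kept by students": True,
}


def degree_of_implementation(checks):
    """Simplest index: the number of components checked as present."""
    return sum(1 for present in checks.values() if present)


index = degree_of_implementation(checklist)
# Dividing by the number of components lets sites with different
# checklist lengths be compared (an added convenience, not from the text).
proportion = index / len(checklist)
print(index, proportion)
```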



Computing results for item by item interpretation. In addition to, or
instead of, drawing relationships in your data, you may simply want to
report results from your implementation instruments item by item. There
are myriad ways to summarize and display this kind of data. Most of these
are beyond the scope of this book, and you should consult a book on data
analysis and reporting for more detailed suggestions.[19]



For the purpose of summarizing responses to individual items, you
might want to present totals, percentages, or group averages. In some
instances, computation will involve nothing more than adding tallies:



Example. Of the 50 children interviewed, 19 boys and 13 girls reported
having taken part in the after-school recreation program. These 32
children reported having engaged in the following activities:

                                  boys    girls    total
   handball                        19       7       26
   bars and rings                  16      12       28
   team games (baseball, etc.)     [figures illegible in source]
   [activity illegible]            [figures illegible in source]
   chess                           [figures illegible in source]
   checkers                        10       8       18



19. See, in particular, Fitz-Gibbon, C. T., & Morris, L. L. How to calculate
statistics; Morris, L. L., & Fitz-Gibbon, C. T. How to prepare an evaluation report;
Henerson, M. E., Morris, L. L., & Fitz-Gibbon, C. T. How to measure attitudes. In
L. L. Morris (Ed.), Program evaluation kit. Beverly Hills: Sage Publications, 1978.






In other cases you may want to convert the numbers to percentages:



Example. Observers used a coded behavior record method to record
teacher-student question-answer patterns during one week of lab
periods in a high school chemistry program. From these extensive
coded behavior records, the evaluator was able to find 503
teacher-student question-answer patterns. The evaluator classified
these according to the following code:

   t = teacher          q = asks a question
   s = student          r = gives a response
                        o = says something [else]

According to this code, tq-sr-tq means that a teacher asked a
question, a student gave a response, and the teacher asked another
question. Accordingly, different sorts of conversation patterns, plus
their relative frequencies, could be broken down as follows:

   Pattern                  Frequency
   tq-sr-tq                  90 (18%)
   tq-sr-tr                  45 (9%)
   [pattern illegible]       42 (8%)
   [pattern illegible]      102 (20%)
   [pattern illegible]       80 (16%)
   [pattern illegible]       34 (7%)
   other                    110 (22%)

It was noted that the frequency of teacher questioning after student
response was relatively high. This was a desirable behavior that the
program had sought to foster.
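
The conversion from counts to percentages in the example can be checked with a short sketch. The counts are as best read from the example; the pattern labels after the first two are placeholders, and the fourth count (102) is inferred from the stated total of 503 rather than read directly.

```python
# Counts as read from the example. Labels "pattern_3" to "pattern_6" are
# placeholders, and 102 is inferred from the stated total of 503.
counts = {
    "tq-sr-tq": 90,
    "tq-sr-tr": 45,
    "pattern_3": 42,
    "pattern_4": 102,
    "pattern_5": 80,
    "pattern_6": 34,
    "other": 110,
}

total = sum(counts.values())

# Percentage of all recorded patterns, rounded to the nearest whole percent.
percentages = {label: round(100 * n / total) for label, n in counts.items()}

for label, n in counts.items():
    print(f"{label:10s} {n:4d} ({percentages[label]}%)")
```

Rounded percentages need not always add to exactly 100, so it is worth reporting the raw counts alongside them, as the example does.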



If questions on the instrument demand answers that represent a
progression, you may wish to report an average answer to the question.

What percent of the period did the teacher spend on discipline?

   [ ] virtually none          [ ] about 75%

   [ ] about 25%               [ ] nearly all the time

   [ ] close to half

Averages can be graphed, displayed, and used in further data analyses. Be
careful, however, to assure yourself that the average is truly representative
of the responses that you received. If you notice that responses to a
particular question pile up at two ends of the continuum, then the answers
seem to be polarized and averages will not be representative. To report
such a result by an average would be misleading to the audience.
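
The warning about polarized answers can be made concrete with a rough check. In this sketch the five scale points are coded 1 to 5, the two response sets are invented, and the 80% cut-off for "piled up at the two ends" is an arbitrary assumption, not a rule from the text.

```python
from collections import Counter
from statistics import mean

# Scale points for "What percent of the period did the teacher spend on
# discipline?": 1 = virtually none, 2 = about 25%, 3 = close to half,
# 4 = about 75%, 5 = nearly all the time.
SCALE = [1, 2, 3, 4, 5]


def summarize(responses, pile_up_share=0.8):
    """Return the mean and a flag that is True when answers pile up at both ends.

    The 0.8 threshold is an illustrative assumption.
    """
    tally = Counter(responses)
    low_end, high_end = tally[SCALE[0]], tally[SCALE[-1]]
    polarized = (
        low_end > 0
        and high_end > 0
        and (low_end + high_end) / len(responses) >= pile_up_share
    )
    return mean(responses), polarized


clustered = [2, 3, 3, 3, 4, 3, 2, 3]   # answers bunch around the middle
split = [1, 1, 1, 5, 5, 5, 1, 5]       # answers sit at the two extremes

print(summarize(clustered))
print(summarize(split))
```

Both sets have averages near the middle of the scale, yet only the first average describes what a typical respondent said; for the second, reporting the tallies themselves is the safer choice.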






Whether or not you become embroiled in reporting means and percentages
and looking for relationships, you will probably have to use the data
you collect to underpin a program description. Program descriptions are
usually presented as narrative accounts or descriptive tables such as Table
3, page 59 or Table 4, below.

TABLE 4

Project Monitoring: Activities

Winons School District                                   Year: 19YY

Objective 6. By February 28, 19YY, each participating school will
implement, evaluate results, and make revisions in a program for the
establishment of a positive climate for learning.

Activities for this objective (each rated monthly, Sep through Jun)

   6.1  Identify staff to participate
   6.2  Selected staff members review ideas, goals, and objectives
   6.3  Identify student needs
   6.4  Identify parent needs
   6.5  Identify staff needs
   6.6  Evaluate data collected in 6.3 - 6.5
   6.7  Identify and prioritize specific outcome goals and objectives
   6.8  Identify existing policies, procedures, and laws dealing with
        positive school climate

   [The monthly rating entries for each activity are illegible in the
   source.]

Evaluator's Periodic Progress Rating:

   I = Activity Initiated        P = Satisfactory Progress
   C = Activity Completed        U = Unsatisfactory Progress



Table 4 best suits interim formative reports concerned with how faithfully
the program's actual schedule of implementation conforms to what was
originally planned. A formative evaluator can use this table to report, for
instance, the results of monthly site visits to both the program director
and the staff at each location. Each brief interim report consists of a table,
plus accompanying comments explaining why ratings of "U," unsatisfactory
implementation, have been assigned.



16. This table has been adapted from a formative monitoring procedure
developed by Marvin C. Alkin.






TABLE 2

Methods for Collecting Implementation Data

Method 1: Examine Records.

Records are systematic accounts of regular occurrences consisting of
such things as attendance and enrollment reports, sign-in sheets,
library checkout records, permission slips, counselor files, teacher
logs, individual student assignment cards, etc.

Advantages

• Records kept for purposes other than the program evaluation can be a
  source of data gathered without additional demands on people's time
  and energies.
• Records are often viewed as objective and therefore credible.
• Records set down events at the time of occurrence rather than in
  retrospect. This also increases credibility.

Disadvantages

• Records may be incomplete.
• The process of examining them and extracting relevant information
  can be time-consuming.
• There may be ethical or legal constraints involved in your
  examination of certain kinds of records, counselor files, for example.
• Asking people to keep records specifically for the program
  evaluation may be seen as burdensome.

Method 2: Conduct Observations.

Observations require that one or more observers devote all their
attention to the behavior of an individual or group within a natural
setting and for a prescribed time period. In some cases, an observer
may be given detailed guidelines about who or what to observe, when
and how long to observe, and the method of recording the information.
An instrument to record this kind of information would likely be
formatted as a questionnaire or tally sheet. An observer may also be
sent into a classroom with less restrictive instructions, i.e., without
detailed guidelines, and simply asked to write an account of events
that occurred within the prescribed time period.

Advantages

• Observations can be highly credible when seen as the report of what
  actually took place presented by disinterested outsider(s).
• Observers provide a point of view different from that of people most
  closely connected with the program.

Disadvantages

• The presence of observers may alter what takes place.
• Time is needed to develop the observation instrument and train
  observers if the observation is highly prescribed.
• It is necessary to locate credible observers if the observation is
  not carefully controlled.
• Time is needed to conduct sufficient numbers of observations.
• There are usually scheduling problems.

Method 3: Use Self-Report Measures.

Questionnaires are instruments that present information to a
respondent in writing or through the use of pictures and then require
a written response: a check, a circle, a word, a sentence, or several
sentences.

Advantages

• Questionnaires provide the answers to a variety of questions.
• They can be answered anonymously.
• They allow the respondent time to think before responding.
• They can be given to many people, at distant sites, simultaneously.
• They can be mailed.
• They impose uniformity on the information obtained by asking all
  respondents the same things, e.g., asking teachers to supply the
  names of all math games used in class throughout the semester.

Disadvantages

• They do not provide the flexibility of interviews.
• People are often better able to express themselves orally than in
  writing.
• Persuading people to complete and return questionnaires is sometimes
  difficult.

Interviews involve a face-to-face meeting between two (or more)
persons in which a respondent answers questions posed by an
interviewer. The questions may be predetermined, but the interviewer
is free to pursue interesting responses. The respondent's answers are
usually recorded in some way by the interviewer during the interview,
but a summary of the responses is generally completed afterwards.

Advantages

• Interviews can be used to obtain information from people who cannot
  read and from non-native speakers who might have difficulties with
  the wording of written questions.
• Interviews permit flexibility. They allow the interviewer to pursue
  unanticipated lines of inquiry.

Disadvantages

• Interviewing is time-consuming.
• Sometimes the interviewer can unduly influence the responses of the
  interviewee.

TABLE 3

Program Implementation Description

Program: [name illegible in source]
Program Component: 4th Grade Reading Comprehension, Remedial Activities

1. Person responsible for implementation: Teacher
   Target group: Students
   Activity: Vocabulary drill and games
   Materials, with organization, frequency, and progress expected for each:
      (a) SMA [partly illegible], 3rd & 4th level
          Organization for activity: Small groups (based on CTBS
          vocabulary scores)
          Frequency/duration: Daily, 15-20 minutes
          Amount of progress expected: Completion of SMA, Level 4, by
          all students
      (b) Teacher-developed word cards
          Organization for activity: Same
          Frequency/duration: Same
          Amount of progress expected: None specified
      (c) Vocabulary Old Maid
          Organization for activity: Same
          Frequency/duration: [illegible in source]
          Amount of progress expected: None specified

2. Person responsible for implementation: Teacher/Aide
   Target group: Students
   Activity: Language experience activities: keeping a diary, writing
   stories
   Materials: Student notebooks, primary and elite typewriters
   Organization for activity: Individual
   Frequency/duration: Productions checked weekly (Fridays); students
   work at self-selected times or at home
   Amount of progress expected: Completion of at least one
   2[digit illegible]-page notebook by each child; 80% of students
   judged by teacher or aide as "making progress"

3. Person responsible for implementation: Reading specialist/teacher,
   student tutors
   Target group: Students
   Activity: Peer tutoring within class, in readers and workbooks
   Materials: United States Book Company Urban Children reading series
   and workbooks
   Organization for activity: Student tutoring dyads
   Frequency/duration: Monday through Thursday, 20-30 minutes
   Amount of progress expected: Completion of 1+ grade levels by 60% of
   students

4. Person responsible for implementation: Principal
   Target group: Parents
   Activity: Outreach: inform parents of progress; encourage at-home
   work in Urban Children texts; hold two Parents' Nights; periodic
   conferences
   Organization for activity: All parents for program come to Parents'
   Night; other contact with parents on individual basis
   Frequency/duration: Two Parents' Nights, Nov. and Mar.; 3 written
   progress reports, Dec., Apr., June; other contacts with parents ad hoc





[Figure 2: cube diagram. One axis is labeled "ROLES (and instruments)",
with entries including interview, classroom observation, and
questionnaires.]

Figure 2. Cubes depicting a sampling plan for measuring implementation
of a middle-grades reading program in four schools within a district.
The large 4x4x3 cube shows the overall data-collection plan from which
sample cells may be drawn. The smaller cube shows selection of a random
sample (shaded segments) of classrooms and reading periods chosen at
Howard School for observation during a 3-day February site visit.






Example of a people-item data roster

Observation Response Form (results from classroom 1)

Implementation Objective: Students will direct and monitor their own
progress in math activities.

During the math period:

1. Students worked on individual math assignments.
2. Students asked for help with finding materials to work on.
3. Students loitered about, working at no activity in particular.
4. Students used self-testing sheets.
5. Students sought out aide [remainder of item illegible in source]

[The response scale and the ratings recorded on the form are illegible
in the source.]

Summary Sheet (people-item format)

                  Item 1   Item 2   Item 3   Item 4   Item 5
   Classroom 1    [entries illegible in source]
   Classroom 2
   Classroom 3


Examples of quick-tally sheets

Questionnaire

                                             yes    no    uncertain
1. Were the materials available
   when you needed them?                     [ ]    [ ]      [ ]

2. Were the materials suitable
   for your students?                        [ ]    [ ]      [ ]

Summary Sheet (quick-tally format)

   Item     yes     no     uncertain
   1        [tallies illegible in source]
   2        [tallies illegible in source]
   etc.

Observation Instrument

[The items on this instrument are largely illegible in the source.
Legible fragments refer to group interaction and to members taking
part in group planning.]

Summary Sheet (quick-tally format)

[Rating scale running from "unsatisfactory" through "satisfactory" to
"outstanding"; tallies illegible in source.]








Chapter 4



METHODS FOR MEASURING PROGRAM IMPLEMENTATION: PROGRAM RECORDS 
An historian studying the activities of the past relies in
large part on primary sources, documents created at the time in
question that, taken together, allow the scholar to recreate a
developmental picture of what happened. Evaluators, too, can
take advantage of the historian's methods by using a program's
records, the tangible remains of program occurrences, to
construct a credible portrait of what has gone on in the program.
Records are unobtrusive measures: because they are ongoing or
require little effort on any one person's part, they can provide
valuable information concerning program implementation. Consider
the list of commonly kept records given in Table 5. Any of these
could be used to develop your description of a program
implementation, although you will most likely find the clearest
overall picture of the program in those records that program
staff have kept systematically on an ongoing basis.

If you want to measure program implementation by means of records, 
consider two things: 

• How can you make good use of existing records? 

• Can you set up a record-keeping system that will give you needed 
information without burdening the staff? 

Where records are already being kept, you can use them as a source of
information about the activities they are intended to record. Since the
progress charts, attendance records, enrollment forms, and the like kept
for the program will seldom cover all you need to know, though, you
might try to arrange for the staff or students to maintain additional
records. Of course, you will be able to set up record-keeping only if
your evaluation begins early enough during program implementation to
allow for an accurate picture of what has occurred.

In most cases, it is unrealistic to expect that the staff will keep records
over the course of the program solely to help you gather implementation
information, unless these records are easy to maintain (e.g., parent-aide
sign-in sheets) or are useful for their own purposes as well. You will do
best if you come up with a valid reason why the staff should keep records
and attempt to align your information needs with theirs. You could, for
instance, gain access to records by offering a service.






Table 5. Records Often Produced by Educational Programs

-Certificates upon completion of activities
-Completed student workbooks
-Student assignment sheets
-Dog-eared and worn textbooks
-Products produced by students (e.g., drawings, lab reports,
 poems, essays)
-Attendance and enrollment logs
-Sign-in and sign-out sheets
-Progress charts and checklists
-Unit or end-of-chapter tests
-Teacher-made tests
-Circulation files kept on books and other materials
-Diplomas and transcripts
-Report cards
-Letters of recommendation
-Activity or field-trip rosters
-Letters to and from parents, business persons, the community
-Logs, journals, and diaries kept by students, teachers, or aides
-Parental permission slips
-In-house memos
-Flyers announcing meetings
-Records of bookstore or cafeteria purchases or sales
-Legal documents (e.g., licenses, insurance policies, rental
 agreements, leases)
-Bills, purchasing orders, and invoices from commercial firms
 providing goods and services
-Minutes or tape-recordings of meetings
-Newspaper articles, news releases, and photographs
-Standardized test scores (local and state)






For example, by agreeing to write software for a custom-made
management information system, the evaluator of an adolescent
parenting center structured ongoing data collection of value both
to his clients and to any future evaluator. In another instance,
an evaluator was able to monitor program implementation at school
sites statewide by helping schools write the periodic reports that
had to be submitted to the State Department of Education.

Implementation Evaluation Based On
Already Existing Records

The following is a suggested procedure to help you find pertinent
information within the program's already existing records, and extract
the information.



Step 1. Construct a program characteristics list

Compose a list of the materials, activities, and/or administrative
procedures about which you need supporting data. This procedure was
detailed in Chapter [illegible], pages [illegible] to [illegible].

Step 2. Find out from the staff or the program director what records have 
been kept and which of these are available for your inspection. 

Be sure you are given a complete listing of every record that the program
produced, whether or not it was kept at every site. Probe and suggest
sources that might have been forgotten. Draw up a list of all records that
will be available to you.

If part of your task is to show that the program as implemented
represents a departure from past or common practice, you might include
records kept before the program.

Step 3. Match the lists from Steps 1 and 2 

For each type of record, try to find a program feature about which the
record might give information. Think about whether any particular record
might yield evidence of:

• The duration or frequency of a program activity

• The form that the activity took, what it typically looked like; you
will find this information only in narrative records such as curriculum
manuals and logs, journals, or diaries kept by the participants

• The extent of student or other participant involvement in the
activities: attendance, etc.



Do not be surprised if you find that few available records will give you the 
information you need. The program staff has maintained records to fit its 
own needs; only sometimes will these overlap with yours. 
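
The matching in Step 3 is a simple cross-reference of two lists. The sketch below shows one way to keep track of it; the characteristics, the record names, and the judgments about what each record covers are all hypothetical.

```python
# Hypothetical list from Step 1: program characteristics needing evidence.
characteristics = ["peer tutoring sessions", "library use", "parent conferences"]

# Hypothetical list from Step 2: each available record, with the
# characteristics the evaluator judges it might give evidence about.
available_records = {
    "library sign-in sheets": ["library use"],
    "tutoring logs": ["peer tutoring sessions"],
    "cafeteria sales records": [],
}


def match_records(features, records):
    """For each program feature, list the records that might document it."""
    matches = {feature: [] for feature in features}
    for record, covered in records.items():
        for feature in covered:
            if feature in matches:
                matches[feature].append(record)
    return matches


coverage = match_records(characteristics, available_records)
# Features with no matching record still need another data source.
uncovered = [feature for feature, recs in coverage.items() if not recs]
print(coverage)
print(uncovered)
```

The list of uncovered features is the practical output: it shows where existing records fall short and another measure will be needed.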






Step 4. Prepare a sampling plan for collecting records 

General principles for setting up a data collection sampling plan were
discussed in Chapter 3, page 60. The methods described there direct you
either to sample typical periods of program operation at diverse sites or to
look intensively at randomly chosen cases. Were you to use the former
method for describing, say, a language arts program, you might ask to see
"library sign-in sheets and circulation files for the fall quarter at
[name illegible] Junior High," as well as for other times and other places,
all randomly chosen. The latter method directs that you focus on a few
sites in detail. An intensive study might cause you to choose one school
as representative of participating junior high schools and examine its
whole program in addition to the library component. You could, as well,
find your own way to mix the methods.

If part of your program description task involves showing the extent to
which the program is a departure from usual practice, you could include in
the sample sites not receiving the program and use these for comparison.
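
Both sampling approaches in Step 4 come down to drawing at random from a list. A minimal sketch, assuming four schools and four months as the sampling frame (names, sample sizes, and the fixed seed are illustrative choices, not part of the procedure):

```python
import random

# Hypothetical sampling frame: every (school, month) pair is one candidate
# unit for which records could be requested.
schools = ["School A", "School B", "School C", "School D"]
months = ["Sep", "Oct", "Nov", "Dec"]
frame = [(school, month) for school in schools for month in months]


def sample_units(units, k, seed=0):
    """Draw k units at random without replacement; the seed makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(units, k)


# Former method: typical time periods at diverse sites.
time_sample = sample_units(frame, 5)

# Latter method: a few randomly chosen sites studied in detail.
case_sample = sample_units(schools, 1)

print(time_sample)
print(case_sample)
```

Recording the seed (or the drawn list itself) lets someone else confirm that the sites and periods were chosen by chance and not by convenience.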

Step 5. Set up a data collection roster, and plan how you will transfer the
data from the records you examine

The data roster for examining records should look like a questionnaire:
"How many people used the library during this particular time unit?"
"How long did they stay?" "What kinds of books did they check out?"
Responses can take the form of tallies or answers to multiple choice
questions.

When data collection is complete, you might still have to transfer it
from the multitude of rosters or questionnaires used in the field to single
data summary sheets, described in Chapter 3, pages 67 to 71.

Step 6. Where you have been able to identify available records pertinent
to examining certain program activities, set up a means for obtaining
access to those records in such a way that you do not inconvenience the
program staff.

Arrange to pick up the records or copy them, extract the data you need,
and return them as quickly and with as little fuss as possible. A member of
the evaluation staff should fill out the data summary sheet; program staff
should not be asked to transfer data from records to roster.

Setting Up a Record-Keeping System 

What follows is a suggested procedure for establishing a
record-keeping system, or what is sometimes called a management
information system (MIS). With the growing availability of
computers for even small organizations, program personnel
increasingly have the capacity to collect and maintain data for
use on an ongoing basis. While evaluators seldom have the luxury
of building provisions for their own record-keeping into the
program itself, they should be prepared to take advantage of the
[text missing in source] record-keeping primarily to the needs of
program staff and planners and only secondarily to the needs of
the implementation evaluation.

Step 1. Construct a program characteristics list for each 
program you describe. 

Compose a list of the materials, activities, and/or
administrative procedures about which you need supporting data.
(This procedure was outlined on pages ?? to ??.) If the
evaluation uses a control group design or if one of your tasks is
to show that the program represents a departure from usual
practice in the district, you may need to describe the
implementation of more than one program. You should construct a
separate list of characteristics for each program you describe.



Attach to your list, if possible, columns headed in the manner of
Columns 2 and 3, Table [illegible], page [illegible]. This table has been
constructed to accompany an example illustrating the procedure for setting
up a record-keeping system.



Step 2. Find out from the program staff and planners which records will
be kept during the program as it is currently planned

Be sure this list includes tests to be given to students, reports to parents,
assignment cards: all records that will be produced over the course of the
program.



Step 3. For each program characteristic listed in Step 1, decide 
if a proposed record can provide information that will be both 
useful and sufficient for the evaluation's purposes 






First, examine the list of records that will be available to you. Will any of
them be useful as a check of either quantity, quality, regularity of
occurrence, frequency, or duration of the program characteristic? If it will,
enter its name on your activities chart next to the activity whose
occurrence it will demonstrate. Jot down a judgment of whether the record
as is will fit your needs or whether it might need slight modification. Also
enter the number of collections or updatings of the record that will take
place over the course of the program. If the number of collections seems
insufficient to give a good picture of the program, talk to the staff to
request more frequent updating.






Step 4. For those characteristics that are not covered by the
staff's list of planned records, decide if simple additions or
alterations can provide appropriate and adequate evaluation data

When you have finished your review of records that will be available,
look closely at the set of program activities about which you still need
information. These will not be covered by the staff's list of planned
records. Try to think of ways in which alteration or simple addition to one
of the records already scheduled for collection might give you information
on the frequency of occurrence or form of one of the activities on your
list. If it appears that slight alteration of a record will give you the
information you need, note the name of the record and its planned
collection frequency and request that the program staff make the change
you need.



Step 5. Meet with program staff first to review the planned 
records that will provide data for the evaluation and second to 
recommend changes and additions for their consideration 



Before seriously approaching the staff and asking for their assistance
with your information collection plan, however, scrutinize it as follows:

• Will it be too time-consuming for the staff to fill out regularly?

• Will the staff members perceive it as useful to them?

• Can you arrange a feedback system of any sort to give the staff 
useful information based on the records you plan to ask them to 
keep? 

If the information plan you have conceived passes these checkpoints, 
suggest it to the staff. 

Try to avoid data overload. Do not produce a mass of data for which
there is little use. The way to avoid collecting an unnecessary volume of
data is to plan data use before data collection.




Step 6. Prepare a sampling plan for collecting records

Once you know which records will be kept to facilitate your
implementation evaluation, decide where, when, and from whom you will
collect them. General principles for setting up a data collection sampling
plan were discussed in Chapter 3, page 60. The methods described there
produce two types of samples:

• A sample that selects typical time periods or episodes from the 
program at diverse sites 

• A sample that selects people, classes, schools or other sites, consid- 
ering each case typical of the program 

Your sampling plan could use either or both.



Step 7. Set up a data collection roster and plan how you will transfer data
from the records you examine

The data roster for examining records should resemble a questionnaire for
which answers take the form of tallies or, in some cases, multiple-choice
items.



The data roster is a means for making implementation information 
accessible to you when you need it so that it can be included in the data
analysis for your report. The roster, you will notice, compiles information
from a single source, covering a single time period. For the purpose of
your report, you will usually have to transfer all of the roster data to a 
data summary sheet in order to look at the program as a whole. Chapter 3 
describes data summary sheets, including those for managing data process- 
ing by computer, beginning on page 67. 
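
The transfer from single-source rosters to a data summary sheet can be sketched in code. The Python below is a minimal illustration only: the roster layout, the field names (site, week, mean_pages), and all figures are hypothetical and are not taken from the kit.

```python
from collections import defaultdict

# Hypothetical rosters: each one covers a single source and a single time period.
rosters = [
    {"site": "School A", "week": "Jan 26", "mean_pages": 4.5},
    {"site": "School A", "week": "Mar 2", "mean_pages": 5.5},
    {"site": "School B", "week": "Jan 26", "mean_pages": 3.0},
]

def summarize(rosters):
    """Pool roster values by site so the program can be viewed as a whole."""
    by_site = defaultdict(list)
    for roster in rosters:
        by_site[roster["site"]].append(roster["mean_pages"])
    # One summary figure per site: the average of its roster means.
    return {site: sum(values) / len(values) for site, values in by_site.items()}

summary = summarize(rosters)
```

Each entry in the result corresponds to one line of a summary sheet; a real sheet would carry more columns than this single averaged figure.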

Step 8. Set up a means for obtaining easy access to the records you need

Gather records from the staff in a way that minimally interferes with their
busy work schedules. You, or your delegate, should arrange to collect
workbooks, reports, checklists, or whatever, photocopy them or extract
the important data, and return these records as quickly as possible. Only in
those rare situations where the staff itself is ungrudgingly willing to
participate in your data collection should you ask them to bring records to
you or transfer information to the roster.







Step 9. Check periodically to make sure that the information you
have requested from program staff is in fact being recorded
accurately and completely 

It is one thing to plan an implementation evaluation thoroughly
at the beginning of a program. It is another thing altogether to
return, say, a year later and actually find the records ready for
use. In many cases you may return at the end of the year to
discover that what you thought program staff were going to do in
the area of record keeping and what they actually did were two
different things. If the effectiveness of your evaluation relies
on records kept by program personnel, you are well advised to
check periodically to make sure that the information you need is
being collected and maintained. In the press of program
activities, record-keeping may become burdensome or, given
limited resources, even an inappropriate use of staff time.







Example. Ms. Gregory, Director of Evaluation for a mid-sized school
district, is intending to evaluate the implementation of a state-funded
compensatory education program for grades K through 3. The program
uses individualized instruction. After examining the program proposal
and discussing the program with various staff members, she has
constructed the implementation record-keeping chart shown in Table 6.



TABLE 6

Example of an Implementation
Record-Keeping Chart

Column 1: Activities (3rd grade)
Column 2: Records to be used for monitoring the activity.
          Adequate for assessing implementation?
Column 3: Frequency and regularity of record collection.
          Sufficiently representative to assess implementation?

Column 1 entries:

1) Early morning warm-up, group exercise (10 min./day)

2) Individualized reading ([illegible] min./day)
   Each student:
   a) reading aloud with teacher or aide (at least 3 times/week)
   b) cassette work at recorder center
   c) reading seatwork: workbook or library book

3) Perceptual-motor time
   a) clapping rhythms (in group)
   b) open balance period (individual, on jungle gym, balance
      beam, etc.)




Example continued. Ms. Gregory found that program teachers already
planned to keep records of students' progress in "reading aloud"
(Activity 2a) and of their work with audio tapes in the "recorder corner"
(2b). Further, this record collection as planned seemed to Ms. Gregory
to give her exactly the implementation information she needed: teachers
planned to monitor reading via a checksheet that would let them
note the date of each student's reading session and the number of pages
read.

Teachers also planned to note the quality of student performance, a
bit of data that Ms. Gregory did not need. Work with cassettes in the
recorder corner (2b) was to be noted on a special form by an aide, but
only the progress of children with educational handicaps would be
recorded. These audio corner records, Ms. Gregory decided, would not
be adequate. She needed data on all children's use of the tapes. She
noted the usefulness of this information on her chart, with an additional
notation to speak to the staff about changing record-keeping in
the recorder center to include at least a periodic random sample from
the whole class.




Column 1: Activities
Column 2: Record
Column 3: Collection

2a) Reading aloud with teacher (3 times/week)
    Record: teacher's/aide's record book; gives dates of reading,
            no. of pages read -- adequate
    Collection: constant recording -- adequate

2b) Reading cassette work at recorder center
    Record: aide's recording form; gives account of time, progress,
            [illegible] -- adequate
    Collection: only on EH children -- inadequate; speak with staff:
            could they look at all students?





Example continued. Ms. Gregory needed some information for which
no records were planned. For instance, teachers and aides did not
intend to keep records of students' participation in "perceptual-motor
time" (Activity 3). Ms. Gregory noted this and determined to meet
with the staff to suggest some data collection.



3) Perceptual-motor time
   Record: none -- inadequate; suggest that aides keep a checklist
           or diary of length and content of daily sessions

   a) clapping rhythms
      Record: none -- inadequate; aide diary?

   b) open balance period (individual, on jungle gym, balance beam)
      Record: none -- inadequate; aide diary?



Ms. Gregory spoke with aides about the possibility of keeping a diary of
perceptual-motor activities. Aides resisted this idea; they wanted the
period to be relatively undirected and they saw it as a break for
themselves from regular in-class record-keeping. They did, however, feel
that it would be useful to them to have a record of each student's
progress in balancing and climbing. Ms. Gregory was thus able to
persuade them to construct a checklist called GYM APPARATUS I
CAN USE, to be kept by the students themselves and collected once a
month. Ms. Gregory decided to collect data on the "clapping" part of
the perceptual-motor period in some way other than by examining
records, perhaps via a questionnaire to aides at the end of the year or
through observations.



Example continued. Ms. Gregory was faced with the responsibility of
practically single-handedly evaluating a comprehensive year-long
program. As it turned out, Ms. Gregory was quite successful at finding
records that would provide her with the implementation information
she needed. The following records would be made available to her:

• The teachers' record books showing progress in read-aloud sessions

• Aides' recording forms of students' recorder corner work

• Students' GYM APPARATUS I CAN USE checklists

Also available were other records for teaching math, music, and basic
science, topic areas not included in the example. All records would be
available to Ms. Gregory throughout the year. But how would she find
time to extract data from them all?

By means of a time sampling plan, Ms. Gregory could schedule her
record collection and data transcription to make the task manageable.
First, she chose a time unit appropriate for analyzing the types of
records she would use. The teachers' records of read-aloud sessions, for
example, should be analyzed in weekly units rather than daily units.
According to Ms. Gregory's activities list, the program did not require
students to read every day; they must read for the teacher at least three
times per week. Perceptual-motor time could be analyzed by the day,
however, since the program proposal specified a daily regimen. She then
selected a random sample of weeks from the time span of the program
and arranged to examine program records at the various sites. She
selected days for which gym apparatus progress sheets would be
examined.

Site and participant selection was random throughout. For each
week of data collection, she randomly chose four of the eight
participating schools, and within them, two classes per grade whose records
would be examined.
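
A sampling plan of this kind can be drawn mechanically. The sketch below is an illustration under stated assumptions, not Ms. Gregory's actual procedure: the function name draw_sample, the 36-week program length, the school and class labels, and the restriction to a single grade are all hypothetical. Only the "four of eight schools, two classes" pattern comes from the example.

```python
import random

def draw_sample(weeks, schools, classes_by_school,
                n_weeks, n_schools, n_classes, seed=None):
    """Randomly pick weeks, then schools within each week, then classes."""
    rng = random.Random(seed)
    plan = {}
    for week in rng.sample(weeks, n_weeks):
        chosen_schools = rng.sample(schools, n_schools)
        plan[week] = {
            school: rng.sample(classes_by_school[school], n_classes)
            for school in chosen_schools
        }
    return plan

# Hypothetical program: 36 weeks, eight schools, four third-grade classes each.
weeks = [f"week {i}" for i in range(1, 37)]
schools = [f"School {letter}" for letter in "ABCDEFGH"]
classes = {s: [f"{s}, grade 3, class {k}" for k in range(1, 5)] for s in schools}

plan = draw_sample(weeks, schools, classes,
                   n_weeks=6, n_schools=4, n_classes=2, seed=1)
```

Fixing the seed makes the draw repeatable, which helps if the plan has to be shown to program staff or reconstructed later.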






Example continued. Having sampled both time units and classrooms,
Ms. Gregory consulted teachers' records from eight classrooms at each
grade level for the week of January 26. Once she had prepared a list of
the 30 students in one of the third grade samples, she

• Tallied the number of times each one read

• Recorded the number of pages read

• Calculated the mean number of pages read that week per student

Ms. Gregory's data roster for gathering information on third-grade
read-aloud sessions from one teacher's record book looked like Table 7.

TABLE 7
Example of a Data Roster
for Transferring Information
From Program Records

Individualized Program

Class: Mr. Roberts -- 3rd Grade
School: Wilson Park
Activity: Reading aloud with teacher or aide
Data source: Teacher's record book
Questions: How often did children read per week?
           How many pages did they cover?
Time unit: Week of January 26

Student             Tally of times    No. of            Mean no. of
                    student read      pages read        pages read

Adams, Oliver       ////   4          4, 5, 6, 5        5
Aull, Molly         //     2          3, 4              3.5
Caldwell, Maude     ///    3          4, 3, 5           4
Connors, Stephen    /////  5          [illegible]       4
Ewell, Leo          ///    3          3, 5, 4           4
Coldwell, Nora      /////  5          6, 2, 3, 4, 5     4
Gross, Joyce        //     2          7, 8              7.5
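
The arithmetic behind each roster row is simple enough to check by machine. The sketch below uses three legible rows from Table 7; the function name roster_row and the dictionary layout are hypothetical conveniences, not part of the kit.

```python
# Pages read per session, copied from three legible rows of Table 7.
pages_read = {
    "Adams, Oliver": [4, 5, 6, 5],
    "Caldwell, Maude": [4, 3, 5],
    "Ewell, Leo": [3, 5, 4],
}

def roster_row(pages):
    """Return (tally of sessions, mean pages per session) for one student."""
    tally = len(pages)
    mean = sum(pages) / tally if tally else 0.0
    return tally, mean

rows = {name: roster_row(pages) for name, pages in pages_read.items()}
```

For Adams, Oliver this gives a tally of 4 and a mean of 5 pages, matching the roster.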






Chapter 5 

METHODS FOR MEASURING PROGRAM IMPLEMENTATION: SELF-REPORTS 



Chapter 4 described ways in which evaluators can use program
records to provide one type of implementation information.
Because records are for the most part written documents, however,
the picture they create may be incomplete, lacking the
details that only those who experienced the program can provide.
A good way to find out what a program actually looked like is to
ask the people involved. The focus of this chapter,
therefore, is self-reports, the personal responses of program
faculty, staff, administration, and participants.

Self-reports typically take one of two forms: questionnaires
and interviews. Questionnaires asking about different
individuals' experiences with a program enable one evaluator to
collect information efficiently from a large number of people.
Individual or group interviews are more time-consuming, but
provide face-to-face descriptions and discussion of program
experiences. Where there is a plan or theory for the program,
gathering information from staff will involve questioning them
about the consistency between program activities as they were
planned and as they actually occurred. Where the program has not
been prescribed, information from people connected with it will
describe how the program evolved.

Whether they are questionnaires or interviews, self-reports
also differ on the dimension of time. They can consist either of
periodic reports throughout the program or retrospective reports
after the program has ended.

Periodic reports will generally yield more accurate implementation
information because they allow respondents to report about program activities
soon after they have occurred, when they are still fresh in memory. For
this reason, they are nearly always more credible than retrospective
reports. Periodic reports should be used even when your role is summative
and you are required to describe the program only once, at its conclusion.




Retrospective self-reports should be used in only two cases:
when there is no other choice (e.g., because the evaluation is
commissioned near the program's conclusion) or when the program
is small enough or of such short duration that reconstructions
after the fact will be believable. What follows are step-by-step
directions for collecting self-reports through periodic
questionnaires or interviews. These can be adapted easily to
produce a retrospective report.

How To Gather Periodic Self-Reports Over the Course of the
Program

Step 1. Decide how many times you will distribute
questionnaires or conduct interviews and from whom you will
collect self-reports

As soon as you begin working on the evaluation and as early as 
possible in the program's life, decide how often you will need to 
collect self-report information. This decision will be
determined by three factors: 

• The homogeneity of program activities. If each program unit
has essentially the same format as the others, then you will not
need to document descriptions of particular ones. If, for
example, a company's program for updating employees' knowledge in
a technical field consists of standardized lessons containing a
lecture, reading, and class discussion, then any one lesson you
ask about at any given site will reflect the typical format of





the program. In such a case you can plan data collection at your
discretion. If, on the other hand, the program has certain
unique features, say group project assignments that will vary
from site to site or special guest lectures by local university
professors, you will want to ask about these distinguishing
program features as soon as they occur. This will give you a
chance to digest information and provide immediate formative
feedback to program planners and staff.

• Your assessment of people's tolerance for interruptions. Unless the
program is sparsely staffed, you should not ask for more than three
reports from any one individual over the span of a long-term program
(e.g., a year). You can sample, of course, so that the chances
are reduced that any one person will be asked to report often.

• The amount of time you expect to have available for scoring and
interpreting information in reports.

Once you have decided when to collect self-reports, create a
sampling strategy (see pages 60 to 64) by deciding whom you
will ask for self-report information (both by title and by
name) and how you will ensure that various program sites are
adequately represented.
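
The constraints in Step 1 (no more than three reports from any one person, every site represented in every round) can be turned into a small scheduling sketch. Everything in the code below is an assumption made for illustration: the function name schedule_reports, its parameters, the staff names, and the rule of asking the least-burdened people first are not prescribed by the kit.

```python
import random

def schedule_reports(staff_by_site, n_rounds, per_site,
                     max_per_person=3, seed=None):
    """Pick respondents per site for each round, capping requests per person."""
    rng = random.Random(seed)
    # Assumes names are unique across sites.
    counts = {p: 0 for people in staff_by_site.values() for p in people}
    schedule = []
    for _ in range(n_rounds):
        this_round = {}
        for site, people in staff_by_site.items():
            eligible = [p for p in people if counts[p] < max_per_person]
            rng.shuffle(eligible)                    # random order among ties
            eligible.sort(key=lambda p: counts[p])   # least-asked people first
            picked = eligible[:per_site]
            for person in picked:
                counts[person] += 1
            this_round[site] = picked
        schedule.append(this_round)
    return schedule, counts

# Hypothetical staff lists for two sites.
staff = {
    "Site 1": ["Avery", "Blake", "Casey", "Drew"],
    "Site 2": ["Ellis", "Finley", "Gray", "Harper"],
}
schedule, counts = schedule_reports(staff, n_rounds=5, per_site=2, seed=7)
```

With four people per site, five rounds, and two respondents per round, nobody is asked more than three times, and each site contributes to every round.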



Step 2. Warn people that you will be requesting periodic information

As early during the evaluation as possible, inform staff members and
others that in order to measure implementation of their program, you
must ask that they provide you with information about how the program
looks in operation.






Step 3. Construct a program characteristics list

Procedures for listing the characteristics of the program (materials,
activities, administrative arrangements) that you will examine are discussed in
Chapter [illegible], pages [illegible] to [illegible].

Step 4. Decide, if you have not already, whether to distribute
questionnaires, to interview, or to do both

You probably know about the relative advantages and disadvantages of
using questionnaires or interviews. Table [illegible], page [illegible],
reminds you of some of them. If you are using self-report instruments
to supplement program description data from a more credible source
(observations or records), then questionnaire data should be sufficient.
On the other hand, if self-report measures will provide your only
implementation backup data, then you should interview some participants.
Unless you are a clever questionnaire writer, you probably cannot find
out all you need to know about the program from a pencil-and-paper
instrument, and interviews allow a sensitive evaluator to come face to
face with important program concepts and issues.



Step 5. Write questions based on the list from Step 3 that will prompt
people to tell you what they saw and did as they participated in the
program



Anyone who writes questions or develops items on a regular 
basis would do well to consult the books listed at the end 
of the chapter as what can be presented here represents only 
a small part of available knowledge on how to do this well. 
The development of good items for questionnaires and 






interviews clearly combines art, science, common sense, and
practice. What follows is a brief summary of things to
consider when writing questionnaire or interview items. You
should also review Table 8 for a list of pointers to follow
when writing questions for a program implementation
instrument.

To begin, one thing you will need to know is how
participants used the materials and engaged in the
activities that comprised the program. To this end, you
should ask about three topics:



• The occurrence, frequency, or duration of activities. Whether you
collect frequency and duration information in addition to occurrence
will depend on the program. To describe a science lab
program, for instance, you would need merely to determine whether
the planned labs occurred at all, and in the correct sequence. If,
on the other hand, the program in question consisted of daily,
45-minute English conversation drills, then you would need to
know whether the activity occurred with the prescribed frequency
and duration.

• The form the activities took. Gathering information on the form of
the activities means asking about which students took part in the
activities, which materials were used and how often, what activities
looked like, and possibly where they occurred. It will also be useful
to check whether the form of the activities remained constant or
whether the activities changed from time to time or from student to
student.

• The amount of involvement of participants in these activities.
Besides knowing what activities occurred, you should make some
check on the extent of interest and participation on the part of the
target group, say the students. Even if activities were set up as
scheduled, students can only be expected to have learned from them
if they engaged the students' attention. Were students in a math
tutoring program, for instance, mostly working on the prescribed
exercises, or were they conversing about [illegible] and clothes
some of the time? Were students in an unstructured period actually
exploring the enrichment materials, or were they just doing their
homework? Some of this slippage is inevitable in every program.
Still, it is important to determine the extent of non-involvement
in the program you are evaluating.






If you plan to use a questionnaire, then you have a choice of two
question formats: a closed (selected) or open (constructed) response
format. Ease of scoring and clear reporting lead most evaluators to use
closed-response questionnaires. On such a questionnaire, the respondent is
asked to check or otherwise indicate a pre-provided answer to a specific
question. Recording the answers involves a simple tally of response
categories chosen. On the open-response questionnaire, the respondent is asked
to write out a short answer to a more general question. The open-response
format has the advantage of allowing respondents to freely give
information you had not anticipated, but it is time-consuming to score; and unless
you have available a large number of readers, it is not practical for any but
the smallest evaluations. Most questionnaires ask principally
closed-response questions, but add a few open-response options. These allow
respondents to volunteer information important to the evaluation but not
specifically requested.
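
The "simple tally of response categories" for a closed-response item can be illustrated in a few lines. The sketch below is hypothetical throughout: the item (how often tutoring occurred), the response categories, and the six answers are invented for illustration.

```python
from collections import Counter

# Hypothetical answers to one closed-response item:
# "How often did tutoring sessions take place?"
responses = ["daily", "weekly", "daily", "other", "daily", "weekly"]

# Tally how many respondents chose each category.
tally = Counter(responses)

# Express each category as a rounded percentage of all answers.
percent = {category: round(100 * n / len(responses))
           for category, n in tally.items()}
```

Open-response answers cannot be tallied this way until someone has read them and assigned each to a category, which is where the extra scoring time goes.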

To demonstrate how different question types result in
different information, Figures 13, 14, and 15 present
combinations of open- and closed-ended questions for
collecting implementation information on the same program.
Figure 13 is entirely open-ended; Figure 14 combines open-
and closed-ended questions; and Figure 15 uses a
closed-response format exclusively. While the data that
would result from the questionnaire in Figure 15 would be
easily analyzed, this ease is gained at the expense of the
more detailed information that individual teachers could
write in on the two other questionnaire formats. The
appropriateness of the questionnaire items finally selected
will depend both on the questions asked in the evaluation
and on the availability of evaluation staff to analyze
open-response format items. In general, it is worth
including at least one open-ended question on every
questionnaire, whether or not the results will later be
reported. Giving people an opportunity to write down their
concerns alerts them to the importance of their perspective
and provides the evaluation with helpful information for guiding
later activities.






Like questionnaires, interviews can also take several 
forms, again depending on how questions are asked. 
Interviews can range from informal personal conversations 
with program personnel at one extreme to highly quantitative 
interviews that consist of a respondent and an evaluator 

completing a closed-response format questionnaire together 
at the other extreme. (Because this quantitative interview
format doesn't take advantage of the face-to-face
interaction of evaluator with respondent, it is more
properly considered the enactment of a questionnaire than
an interview.)

A basic distinction can be made between interviews that are
structured and those that are unstructured. In a structured
interview, an evaluator asks specific questions in a
pre-specified order. Neither the questions nor their order is
varied across interviewers, and in its purest form the
interviewer's job is merely to ask the predetermined questions
and to record the responses. In cases where an evaluator
already has ideas about how the program looked, structured
interviews can provide corroboration and supporting data.

By contrast, an unstructured interview can explore areas of
implementation that were unplanned or that evolved
differently from the plan. In an unstructured interview the
evaluator poses a few general questions and then encourages





the respondent to amplify his answers. The unstructured
interview is more like a conversation and does not
necessarily follow a specific question sequence.

Unstructured interviews require considerable
interviewing skill. General questions for the unstructured
interview can be phrased in several ways. Consider the
following questions:

• How often, how many times, or hours a week did the program (or its
major features) occur?

• What can you tell me about how the activities actually looked? Can
you recall an instance and describe to me exactly what went on?

• How involved did the students seem to be? Did all students
participate, or were there some students who were always absent or
distracted?

• I understand that you are attempting to implement a behavior
modification, or open classroom, or values clarification program
here. What kinds of classroom activities have been suggested to you
by this point of view?

Since unstructured interviews resemble conversations and can 
easily go off track, they require not only that you compose 
a few questions to stimulate talk, but also that you write 
and use probes. Probes are short comments to stimulate the
respondent to say or remember more and to guide the 
interview toward relevant topics. Two frequently used 
probes are the following: 

Can you tell me more about that? 

Why do you think that happened? 

There is no set format for probes. In fact, a good way of probing to
gain more complete information from respondents who have forgotten or
left something out of their answer might be a simple:

I see, is there anything else?

You should insert probes whenever the respondent makes a strong
statement in either an expected or an unexpected direction. For instance:

Oh, yes. Participation, student involvement was very high, 100%.



The best probe for such a strong response is a simple rephrasing and
repetition:

Your statement is that every student participated 100% of the time?

This probe leads the respondent to reconsider.

Step 6. Assemble the questionnaire or interview instrument 

Arrange questions in a logical order. Do not ask questions that jump from
one subject to another.

Compose an introduction. The introduction honors the respondents'
right to know why they are being questioned. Questionnaire instructions
should be specific and unambiguous. Aim for as simple a format as
possible. You should assume that a portion of the respondents will ignore
instructions altogether. If you feel the format might be confusing, include
a conspicuous sample item at the beginning. Instructions for a mailed
questionnaire should mention a deadline for its return.



Instructions for an interview can be more detailed, of course, and
should include reassurances to blunt the respondent's initial apprehension
about being questioned. Specifically, the interviewer should:

• State the purpose of the interview. Explain what organization you
represent and why you are conducting the evaluation. Explain the
purpose of the interview. Describe the report you will have to make
regarding the activities that occurred in the program, and explain if
possible how the information the respondent gives you might affect
the program.

• Explain whether the respondent's statements can be kept confidential. In
situations where a social or professional threat to the respondent
may be involved, confidentiality of interviews must be stressed and
maintained.

• Explain to the respondent what will be expected during the
interview. For instance, if it will be necessary for the respondent to go
back to the classroom to get records, explain the necessity of this
action.

Some of the above information should probably be made available to
questionnaire respondents as well. This can be done by including a cover
letter with the questionnaire.

Step 7. Try out the instrument 

Before administering or distributing any instrument, check it out. Give it
to one or two people to read aloud, and observe their responses. Have the
people explain to you their understanding of what each question is asking.
If the questions are not interpreted as you intended, alter them
accordingly.

Always rehearse the interviews. Whether you choose to prepare a
structured or unstructured interview, once the questions for the interview
are selected, the interview should be rehearsed. You and other
interviewers should run through it once or twice with whoever is available:
a spouse, an older child, a secretary. This dry run is a test of both the
instrument and the interviewer. Look for inconsistency in the logic of the
question sequencing and difficult or threateningly worded questions.
Advise the person who is playing the role of respondent to be as
uncooperative as possible to prepare interviewers for unanticipated answers
and overt hostility.

Step 8. Administer the instrument according to the sampling plan from 
Step 1 

If you mail questionnaires, give respondents about two weeks to return
them. Then follow up with a reminder, a second mailing, or a phone call if
possible. How do you do such a follow-up if people are to respond
anonymously? One procedure is to number the return envelopes, check
them off a master list as they are returned, remove the questionnaires from
the envelopes, and throw the envelopes away.
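
The numbered-envelope procedure amounts to keeping a master list keyed by envelope number and striking off the numbers that come back. The sketch below illustrates this under stated assumptions: the function name outstanding, the master list contents, and the returned numbers are all hypothetical.

```python
def outstanding(master_list, returned_numbers):
    """List (envelope number, name) pairs whose envelopes have not come back."""
    returned = set(returned_numbers)
    return [(number, name)
            for number, name in sorted(master_list.items())
            if number not in returned]

# Hypothetical master list: envelope number -> person it was sent to.
master_list = {1: "Teacher A", 2: "Teacher B", 3: "Aide C", 4: "Aide D"}

# Envelopes 2 and 4 have been returned and discarded.
to_remind = outstanding(master_list, returned_numbers=[2, 4])
```

Only the envelope carries a number, so once it is discarded the questionnaire itself can no longer be linked to a name, while the master list still shows who needs a reminder.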

When distributing any instrument, ask administrators to lend their
support. If the instrument carries the sanction of the project director or
the school principal, it is more likely to receive the attention of those
involved. The superintendent's request for quick returns will carry more
authority than yours.

If you interview, consider the following suggestions.

• Interviewers should be aware of their influence over what
respondents say. Questions about the administration of the program may
be answered defensively if staff members fear their answers might
make them look bad in a report. Explain to the respondents that the
report will refer to no one personally. Understand, as well, that
respondents will speak more candidly to interviewers whom they
perceive as being like themselves, not representatives of authority.

• Interviewers should have a plan for dealing with reluctant
respondents. The best way to overcome resistance is to be explicit about
the interview and what it will demand of the respondent.

• If possible, interviews should be recorded on audiotape to be
transcribed at a later time, particularly unstructured ones. Recorded
interviews enable you to summarize the information using exact
quotes from the respondent; they also require a lot of transcription
time. Transcribing the tape in full will take several times as long
as the interview itself. An alternative is that interviewers take notes
during an unstructured interview. Notes should include a general
summary of each response, with key phrases recorded verbatim. If
possible, summaries of unstructured interviews should be returned to
respondents so that misunderstandings in the transcription can be
corrected.



Step 9. Record data from questionnaires and interview instruments on a 
data summary sheet 

Chapter 3, page 67, described the use of a data summary sheet for
recording data from many forms in one place in preparation for data
summary and analysis. Data from closed-response items on questionnaires
and structured interviews can be transferred directly to the data
summary sheet. Responses to open-response items and unstructured
interviews will have to be summarized before they can be further interpreted.
Procedures for reducing a large amount of narrative information by either
summarizing or quantifying it were discussed on pages [illegible] to
[illegible]. Even if you plan to write a narrative report of your results,
the data summary sheet will show trends in the data that can be described
in the narrative.




TABLE 8 

Some Principles To Follow When Writing Questions For An 
Instrument To Describe Program Implementation 



To ensure usable responses to implementation questions: 

1. When possible, ask about specific, and recent, events or time periods such as
today's math lesson, Thursday's field trip, last week. This persuades people to
think concretely about information that should still be fresh in memory. To
alleviate your own and the respondent's concern about representativeness of the
event, ask for an estimate, and perhaps an explanation, of its typicality.

2. When asking a closed-response question, try to imagine what could have gone
wrong with the activities that were planned. Use these possibilities as response
alternatives. Resourceful anticipation of likely activity changes will affect the
usefulness of the instrument for uncovering changes that did indeed occur. If
you feel that you cannot adequately anticipate discrepancies between planned
and actual activities, then add "other" as a response alternative and ask
respondents to explain.

3. Be sure that you do not answer the question by the way you ask it. A good
question about what people did should not contain a suggestion about how to
answer. For instance, questions such as "Were there [illegible] in the
program?" or "Did you meet every Monday afternoon?" suggest information you
should receive from the respondent. Rather, these questions should be phrased,
"What were the [illegible] levels of the students in the program?" "What days of the
week and how regularly did you meet?"

4. Identify the frame of reference of the respondents. In an interview, you can learn
a great deal from how a person responds as well as from what he says; but when
you use a questionnaire, your information will be limited to written responses.
The phrasing of the questions will therefore be critical. Ask yourself:

• What vocabulary would be appropriate to use with this group?

• How well informed are the respondents likely to be? Sometimes people are
perfectly willing to respond to a questionnaire, even when they know little
about the subject. They feel they are supposed to know, otherwise you would
not be asking them. To allow people to express ignorance gracefully, you
might include lack of knowledge as a response alternative. Word the alternative
so that it does not demean the respondent, for instance, "I have not given
much thought to this matter."

• Does the group have a particular perspective that must be taken into
account, a particular bias? Try to see the issue through the eyes of the
respondents before you begin to ask the questions.






DOCUMENTATION QUESTIONNAIRE
Peer-Tutoring Program

The following are questions about the peer-tutoring program [illegible].
We are interested in knowing your [illegible] information on the back of
this questionnaire.

1. How was the peer-tutoring program structured in your
classroom?


2. How were tutors selected?


3. How were students selected for tutoring?


4. What materials seemed to work best in the peer-tutoring
sessions? Why?


5. What were the strengths of the peer-tutoring program this
year?


6. What changes would you make to improve the program next year?






DOCUMENTATION QUESTIONNAIRE
Peer-Tutoring Program

The following are statements about the peer-tutoring program imple-
mented this year. We are interested in knowing whether they represent
an accurate statement of what the program looked like in operation.
For this reason, we ask that you indicate, using the 1 to 5 scale
after each statement, whether it was "generally true," etc. Please
circle your answer. If you answer seldom or never true, please use
the lines under the statement to correct its inaccuracy.



                                   always  generally  seldom  never  don't
                                   true    true       true    true   know

1. Students were tutored three
   times a week for periods of
   45 minutes each.                  1        2         3       4      5

2. Tutoring took place in the
   classroom, tutors working
   with their own classmates.        1        2         3       4      5

3. Tutors were the fast
   readers.                          1        2         3       4      5

4. There were no discipline
   problems.                         1        2         3       4      5

5. Students were selected for
   tutoring on the basis of
   reading grades.                   1        2         3       4      5

6. Tutoring used the "Read and
   Say" workbooks.                   1        2         3       4      5



Figure [number illegible]. Questionnaire about program activities that
uses both closed and open response formats.
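One way the ratings from a questionnaire like this might be summarized is
sketched below in Python. It is only an illustration: the function name
share_inaccurate, the rating labels as typed, and the sample ratings are
assumptions, not part of the kit. For one statement, it computes the
fraction of informed respondents (those who did not answer "don't know")
who rated the statement seldom or never true.

```python
def share_inaccurate(ratings):
    """Fraction of informed respondents rating a statement seldom or never true.

    `ratings` holds one label per respondent. Respondents who answered
    "don't know" are left out of the denominator. Returns None when no
    informed respondent is left.
    """
    informed = [r for r in ratings if r != "don't know"]
    if not informed:
        return None
    inaccurate = [r for r in informed if r in ("seldom true", "never true")]
    return len(inaccurate) / len(informed)


# Hypothetical ratings from five teachers for one statement.
ratings_for_statement = [
    "always true",
    "seldom true",
    "never true",
    "don't know",
    "generally true",
]
share = share_inaccurate(ratings_for_statement)
print(share)
```

A high share suggests that the statement does not describe the program as
it operated, so the written corrections for that statement deserve a close
reading.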






DOCUMENTATION QUESTIONNAIRE
Peer-Tutoring Program

Please answer the following questions by placing the letter of the
most accurate response on the line to the left of the question. We
are interested in finding out what the project looked like in opera-
tion during the past week, regardless of how it was planned to look.
If more than one answer is true, answer with as many letters as you
need.

_____ 1. On the average, how many times did tutoring sessions take
place in your classroom?

a) never                      c) 3 or 4 times
b) 1 or 2 times               d) 5 or more times

_____ 2. What was the average length of a tutoring session?

a) 5-15 minutes               c) 25-45 minutes
b) 15-25 minutes              d) longer than 45 minutes

_____ 3. Where in the school did tutoring usually take place?

a) classroom                  c) library
b) sometimes classroom,       d) room other than classroom
   sometimes other room          or library

_____ 4. Who were the tutors?

a) only fast students         c) only average students
b) fast students and some     d) other
   average students

_____ 5. On what basis were tutees selected?

a) reading achievement        c) general grade average
b) teacher recommendations    d) other

_____ 6. What materials were used by teachers and tutors?

a) whatever tutors chose      c) "Read and Say" workbooks
b) specially constructed      d) other
   games

_____ 7. How typical of the program as a whole was last week, as you
have described it here?

a) just the same              c) some aspects not typical
b) almost the same            d) not typical at all



Figure 15. Example of a closed-response questionnaire
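Because respondents to a closed-response questionnaire may answer with
more than one letter, each letter has to be counted separately when the
answers are tallied. The short Python sketch below shows one possible way
to do this. The function name tally_responses and the sample answers are
hypothetical and serve only as an illustration.

```python
from collections import Counter


def tally_responses(answers):
    """Count how often each letter was chosen for one question.

    `answers` holds one string per respondent, such as "a" or "bc".
    An empty string means the respondent skipped the question.
    """
    counts = Counter()
    for answer in answers:
        # A respondent may give several letters; count each letter once.
        for letter in sorted(set(answer.lower())):
            if letter.isalpha():
                counts[letter] += 1
    return counts


# Hypothetical answers from five teachers to question 3
# (where tutoring usually took place).
question_3 = ["a", "ab", "c", "a", ""]
tally = tally_responses(question_3)
for letter in "abcd":
    print(letter, tally[letter])
```

Because one respondent can contribute several letters, the counts for a
question can add up to more than the number of respondents.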



Outline for Chapter 6 

METHODS FOR MEASURING PROGRAM IMPLEMENTATION : OBSERVATIONS 

A. Introduction- Setting Up an Observation System 

1. Range of possibilities in observation/participant
observation "systems"

a. Informal, casual, seat of the pants

b. More credible "scientific"/systematic approaches
ranging along continuum from highly prestructured to
those based on emerging information

1) Quantitative, highly structured, predetermined
categories, "research"

2) Qualitative, participant observation, categories
emerge from analysis of field notes, move back
and forth from data to analysis

2. Note limitation when people think they're doing
qualitative study when in fact they're using a casual,
unsystematic approach; cite Patton's kit book

B. Making Quantitative Observations — Steps 1-12 (pp. 90-112); 
plus Stallings Observation System (pp. 112-115) — editorial 
changes only 

C. Making Qualitative Observations (Each step will be 
elaborated; examples added as necessary for clarification) 

1. Step 1. Construct a program characteristics list
describing what the program should look like

2. Step 2. Make initial contact with program personnel,
conduct initial observations, establish entree and
rapport, inform program staff about
participant/observation

3. Step 3. Develop an evaluation timeline based on analysis
of your initial information, prepare a "sampling plan"
for observations, decide how much time can be spent doing
observations; if possible, write out the program "theory"

4. Step 4. Assemble the evaluation team (people familiar
with naturalistic methods), decide on appropriate format
for fieldnotes, discuss evaluation context, initial
findings, critical issues; arrange analysis schedule

5. Step 5. Move back and forth between collecting data
(from participant/observation, interviews,
questionnaires, i.e., whatever data collection techniques
are appropriate) AND analyzing data

a. Do this until you have sufficient information to
answer the users' questions (or you run out of time)




b. Part of process is a series of meetings to discuss
themes, issues, critical incidents; these can involve
evaluators and program personnel as appropriate

c. "Thick description" should be written during the
process; ongoing evolution of written description of
program; should be given to program people for
reaction, then revised as need be

6. Step 6. When all data are in, prepare them for
interpretation and final presentation

D. Chapter Summary

1. Importance of observation techniques
2. Selection of appropriate level of "rigor" in observations 
as well as appropriate type 

E. (Updated) For Further Reading (many texts now available) 






Appendix 

AN OUTLINE OF AN IMPLEMENTATION REPORT 

The outline in this appendix will yield a report describing
program implementation only. In most evaluations, implementation
issues comprise only one facet of a more elaborate enterprise
concerned with the design of the evaluation, the intended
outcomes of the program, the measures used to assess achievement
of those outcomes, and the results these measures produced. If
this description of an extended evaluation responsibility matches
your task, then you will need to incorporate information from the
outline here into a larger report discussing other aspects of the
program and its evaluation. If, in fact, the evaluation compares
the effects of two different programs considered equally
important by your audience, then you should prepare an
implementation report to describe them both.

The headings in this appendix are organized according to the
six major sections of an implementation report:

1. A summary that gives the reader a quick synopsis of the 
report 

2. A description of the context in which the program has been
implemented, focusing mainly on the setting, administrative
arrangements, personnel, and resources involved

3. A description of the point of view from which implementation
has been examined. This section can have one of two
characters:




a. It can describe the program's most critical features as 
prescribed by a program plan, a theory or teaching model, 
or someone's predictions about what will make the program 
succeed or fail, or 

b. It can explain the qualitative evaluator's choice not to 
use a prescription to guide her examination of the 
program. 

4. A description of the implementation evaluation itself: the
choice of measures, the range of program activities examined,
the sites examined, and so forth. This section also includes
a rationale for choosing the data sources listed.

5. Results of implementation backup measures and discussions of
program implementation. This section can do one of two
things:

a. Describe the extent to which the program as implemented
fit the one that was planned or prescribed by a plan,
theory, or teaching model

b. Describe implementation independent of underlying intent.
This description, usually gathered using a naturalistic
method, reflects a decision that the evaluator describe
what she discovered rather than compare program events
with underlying points of view.

In either case, this section describes what has been found,
noting variations in the program across sites or time.

6. Interpretation of results, commendations, and suggestions for
further program development or evaluation.







Report Section 1. Summary

The summary is a brief overview of the report, explaining why
a description of implementation has been undertaken and
listing the major conclusions and recommendations to be found
in Section 6. Since the summary is designed for people who
are too busy to read the full report, it should be limited to
one or two pages, maximum. Although the summary is placed
first in the report, it is the last section to be written.

Report Section 2. Background and Context of the Program 

This section sets the program in context. It describes how
the program was initiated, what it was supposed to do, and
the resources available. The amount of information presented
will depend upon the audiences for whom the report has been
prepared. If the audience has no knowledge of the program,
the program must be fully described. If, on the other hand,
the implementation report is mainly intended for internal use
and its readers are likely to be familiar with the program,
this section can be brief and set down information "for the
record." Regardless of the audience, if your report will be
written, it might become the only lasting record of the
program's implementation. In this case, the context section
should contain considerable data.
If your program's setting includes many different schools or
districts, it may not be practical to cover every evaluation
issue separately for each school or program site. Instead,
for each issue indicate similarities and differences among
schools or sites or the range represented or the most typical
pattern that occurred.
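The range and the most typical pattern for a single issue can be worked out
with a few lines of code when site-level figures are at hand. The Python
sketch below is an illustration only: the function name
summarize_across_sites, the school names, and the session counts are all
hypothetical.

```python
from collections import Counter


def summarize_across_sites(values_by_site):
    """Return the range and the most typical value for one issue across sites."""
    values = list(values_by_site.values())
    # most_common(1) gives the single most frequent value and its count.
    most_common_value, site_count = Counter(values).most_common(1)[0]
    return {
        "lowest": min(values),
        "highest": max(values),
        "most_typical": most_common_value,
        "sites_with_typical": site_count,
    }


# Hypothetical data: tutoring sessions per week reported at five sites.
sessions_per_week = {
    "School A": 3,
    "School B": 3,
    "School C": 2,
    "School D": 5,
    "School E": 3,
}
summary = summarize_across_sites(sessions_per_week)
print(summary)
```

With these figures the report could state that sites held between two and
five sessions a week, and that three sessions a week was the most typical
pattern.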

Report Section 3. General Description of the
Critical Features of the Program as Planned:
Materials and Activities






